Preface


About Tea Time Linear Algebra 


Greetings! And thanks for giving Tea Time Linear Algebra a read. The phrase “tea time” is meant to do more 
than give the book a catchy title. It is meant to describe the general nature of the discourse within. Much of the 
material will be presented as if it were being told to a student during tea time at University, but with the benefit 
of careful planning. There will be no big blue boxes highlighting the main points, no stream of examples after a 
short introduction to a topic, and no theorem... proof... theorem... proof structure. Instead, the necessary terms and
definitions and theorems and examples will be woven into a more conversational style. My hope is that this blend 
of formal and informal mathematics will be easier to digest, and dare I say, students will be more invited to do their 
reading in this format. 


Those who enjoy a more typical presentation might still find this textbook suits their needs. There will be a 
summary of the key concepts at the end of each conversation and a number of the exercises will be solved in complete 
detail in the appendix. One can get a closer-to-typical presentation by scanning for theorems in the conversations, 
reading the key concepts, and then skipping to the exercises with solutions. I hope most readers won’t choose to do 
so, but it is an option. In any case, the exercises with solutions will be critical reading for most. Learning by example 
is often the most effective means. After reading a section, or at least scanning it, readers are strongly encouraged to 
skip to the statements of the exercises with solutions, contemplate their solutions, solve them if they can, and then 
turn to the back of the book for full disclosure. The hope is that, with their placement in the appendix, readers will be 
more apt to consider solving the exercises on their own before looking at the solutions. 


The topical coverage in Tea Time Linear Algebra is fairly typical, but the order of presentation is not. The book 
starts with an introductory chapter covering all the typical matrix arithmetic including inverses and eigenvalues, but 
with only one method for each computation and without accompanying theory. The second chapter begins to bridge 
the gap between computational and theoretical linear algebra, covering row operations and systems of equations, 
concluding with the first theorem of the book, existence and uniqueness of solutions of linear systems. Chapter 3 
introduces the notion of linear independence and revisits eigenpairs, inverses and determinants, adding depth to the
computations of chapter 1. These three chapters conclude what might be considered the bare essentials of matrix 
algebra. Upon completion, students will be able to compute matrix sums and products, dot products, and lengths 
plus eigenpairs, matrix inverses, and determinants in multiple ways. They will be able to solve linear systems with 
any number of solutions and have enough theory to compute the number of solutions of a system without finding 
those solutions. This concludes part I, the mostly computational aspects of the course. Chapter 4 opens up part II by 
extolling the idea of abstraction. Vector spaces, basis, dimension, and isomorphisms are covered. Linear transformations
and inner products are discussed for general vector spaces, not just $\mathbb{R}^n$. Chapter 5 closes part II by considering
vector spaces such as column spaces, null spaces, and eigenspaces, and extending the ideas of orthogonality, length,
and distance to arbitrary inner product spaces. The theme of abstraction is highlighted throughout. Part III (chapters
6 and 7) builds upon the computational aspects and theoretical notions of chapters 1-5 to solve mathematical and
other problems, introducing unstudied theory of linear algebra sparingly. Factorization, iterative methods, geometry, 
and approximation take center stage. While these application sections largely stand on their own, the sections to 
which they refer are included parenthetically in the name of the section to help guide the reader and instructor on 
sequencing. Parts I and II do not have to be completed in their entirety before Part III is considered. 




The first three chapters plus selections from chapters 4 and 5, capped with a smattering of chapters 6 and 7 cover 
what, at SCSU, constitutes a first semester course in linear algebra. It is likely full coverage of the book would 
require more than one semester. As this book is intended for use as a free download or an inexpensive print-on- 
demand volume, no effort has been made to keep the page count low or to spare copious diagrams and colors. In 
fact, I have taken the inexpensive mode of delivery as liberty to do quite the opposite. I have added many passages 
and diagrams that are not strictly necessary for the study of linear algebra, but are at least peripherally related, and 
may be of interest to some readers. Most of these passages will be presented as digressions, so they will be easy 
to identify. For example, the fact that determinants may be calculated by expansion along any row or any column 
is necessary basic fare for the course, but its proof is rather slippery and well beyond many students new to linear 
algebra. Its proof is therefore added as a “crumpet”. Other crumpets similarly cover technical details, but some lay 
out historical context and points for possible further study. They can be skipped without harm to the learning process, 
but are included to provide a more complete understanding of the fundamentals. In any case, each crumpet is there to 
enhance the reader’s understanding or appreciation of the subject, even if the material is not strictly necessary for an 
introductory study of linear algebra. 


Many of the computations can not be done satisfactorily with pencil and paper, so sufficient linear algebra routines 
of SageMath are introduced and discussed. While one could simply ignore the SageMath sections and exercises and 
still get something out of this text, it is my firm belief that full appreciation for the content can not be achieved without 
getting one’s hands “dirty” by doing some calculation. It would be nice if readers have had at least some exposure to 
programming whether it be Python, Java, C, web programming, or just about anything, but I have made every effort 
to give enough detail so that even those who have never written a single line of code will be able to participate in 
this part of the study. In addition to maintaining a completely free learning experience, SageMath was chosen as the 
computer algebra system for this book because it allows linking to SageCells. Each live SageCell link in the PDF of 
this book leads to a bare bones, but fully operational portal to SageMath. Most links land on prepopulated code to aid 
with the process of using computer algebra. For example, in almost all cases the matrices involved in a question will 
be coded for the student to alleviate the tedium and errors of data entry. The SageMathCell website is offered freely 
to anyone and everyone! 


As students come to linear algebra at widely varying levels of maturity, this course is not proof-based, nor does 
it require calculus. There are only 19 theorems and corollaries stated formally as such. Instead, main ideas and their 
proofs are often embedded in the course of discussion. Despite not being a proofs course, proofs are requested in 
the exercises, but usually using the word “justify” or “show”, which may be interpreted as requesting an informal 
argument for those who are not ready for full rigor. Almost never will the instructions begin “Prove...”, though 
students with rigorous proof experience are always most welcome to provide full rigor. In the end, the level of rigor 
is up to the reader and instructor. Several tips on proof technique, such as contraposition and induction, are sprinkled 
throughout the text to aid the unaccustomed reader in digesting some of the arguments, but the explanations are too 
scant to substitute as a complete course on foundations. References to calculus and exercises including integrals and 
derivatives can easily be ignored, with one exception. Section 7.3 on Fourier series necessarily requires calculus. 
Section 4.6 on inner product spaces is enhanced by knowledge of calculus, but does not require it. A corequisite 
course in calculus would suffice for section 4.6 but at least one complete semester is recommended for section 7.3. 


Chapters 1 and 2 form the foundation for the rest of the text and every section therein should be covered in order
before jumping to other topics. As an instructor, you may be tempted to fill in “missing” pieces of the discussion, but 
do whatever it takes to resist. The gaps will be filled in later. The purpose of these chapters is to give a straightforward 
introduction that gently eases the student into the finer details and provide a context for deeper study. While chapters 
6 and 7 are placed at the back of the book, much of the material is appropriate long before the completion of the 
first 5 chapters. The applications within these final two chapters should be sprinkled into the course as prerequisite 
material is covered. For example, the first application, LU factorization, depends only on chapters 1-3 and can easily 
be included immediately following section 3.6, as indicated in the brackets of the section title. Bracketed lists of 
recommended prerequisite sections appear in the titles of all sections of chapters 5-7 to provide some guidance with 
sequencing. These prerequisite lists assume chapters 1 and 2 have been completed in their entirety, and are not meant
to be hard and fast rules. You may find you are able to do without some of the recommendations, and you may be 
more comfortable adding others. The following table is included for further guidance. 




interject this section                         any time after completing this section
7.4 Discrete Dynamical Systems                 3.4
7.2 Markov Chains; and 6.2 The Power Method    3.5
6.1 LU Factorization                           3.6
6.3 Geometry; and 7.5 Rep-tiles                4.4


The remainder of the application sections (6.4, 7.1, 7.3) are best left until after the first 5 chapters, but you may find 
a way to modify the discussion of best approximation to avoid inner products and solution spaces, thus covering it 
sooner. 

No matter how you choose to use the book, I hope you enjoy your reading of Tea Time Linear Algebra. It was my 
pleasure to write it. If you spot one of the many inaccuracies that have undoubtedly evaded my watch, please let me 
know. Feedback is always welcome. 


Leon Q. Brin 
brinl1@southernct.edu


About the Exercises 


Exercises may be marked with one or more of the following symbols. In the PDF version of the book, each of these 
symbols is a hyperlink to the web or another part of the textbook. Click to follow. 


[S]-n This exercise has a detailed solution on page n. 

[S]-n Part of this exercise has a detailed solution on page n. 
[A]-n This exercise has an answer on page n. 

[A]-n Part of this exercise has an answer on page n. 


[SageMath icon] SageMath is recommended or required for this exercise.


[GeoGebra icon] GeoGebra is recommended or required for this exercise.


Answers and solutions may be followed back to the exercise by clicking the exercise number. 


New in the Second Edition 


The biggest difference between the first and second editions is the launch of an accompanying MyOpenMath course 
shell. Instructors can now assign and collect homework perfectly aligned with the textbook online. Create randomly 
generated questions for assessments. Communicate with the class. Maintain an online gradebook. Post notes to the 
online calendar. Full course management available. Most of the questions in the text have an online cousin, and most 
of the cousins contain randomly generated numbers. There are a few questions in the book not available online, and 
there are a few questions available online that are not in the book. 

The online course contains prepopulated assignments for each section including the reading for the section, in- 
structor’s notes, and a limited number of links to additional resources, all of which can easily be modified to suit your 
needs. To get started, request an instructor’s account at MyOpenMath today. It’s free to sign up and free to use. No 
charge ever. Once you have your account, you will find the Tea Time Linear Algebra course in the Course Browser 
at MyOpenMath.

Part I


Matrix Mechanics 


Matrix Calculations 


1.1 Matrices 


The fundamental object of linear algebra is the matrix. A matrix is very much like a table or a spreadsheet, but 
without headings, labels, or lines. The data in a matrix are separated by space. The whole matrix is enclosed by large 
parentheses or square brackets, but is otherwise unadorned. 


Crumpet 1: Dictionary Definition 


ma·trix (mā′trĭks) n., pl. ma·tri·ces (mā′trĭ-sēz′). Math. A rectangular array of algebraic quantities usu. delimited
by parentheses or square brackets.


The following are all matrices. 


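$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

[Further example matrices are illegible in the source.]

$$\begin{bmatrix} .988 & .405 & .877 & .752 & .541 & \text{?} & \text{?} \\ .390 & .595 & .186 & .328 & .315 & .566 & .478 \\ .731 & .224 & .254 & .543 & .575 & .499 & .881 \end{bmatrix}$$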

The size of a matrix is described by its number of rows “by” its number of columns, and is abbreviated as in 
2 x 3, read “two by three”. A 7 x 5 matrix has seven rows and 5 columns. The number of rows is always listed first. 
The rows are indexed from top to bottom, and the columns are indexed from left to right. The first row is the topmost 
row, and the first column is the leftmost column. There are no restrictions on the numbers of rows or columns other 
than each must be a positive integer. The individual quantities in a matrix are called entries. The entry in the $i$th row
(from the top) and $j$th column (from the left) of a matrix is called the i,j-entry. The row number always precedes the
column number.

Matrices are most often labeled by capital letter variables such as A, B, or M. This helps distinguish them from
numerical variables such as x, y, z, s, or t. In this text, the i,j-entry of a matrix A is denoted $A_{i,j}$. The 5,1-entry (fifth
row, first column) of a matrix M is denoted $M_{5,1}$, for example.


4 CHAPTER 1. MATRIX CALCULATIONS 


Crumpet 2: Other Notations for Entries 


The subscripted lower case counterpart to a matrix variable is often used to represent the entries of a matrix. You will 
often see $b_{1,2}$ or even $b_{12}$ represent the 1,2-entry of B. Don’t be surprised when you run into it!


Two matrices are equal if they are the same size and corresponding entries are equal. 
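
The definition of equality can be sketched in a few lines of plain Python (not SageMath), storing a matrix as a list of rows. The function names `same_size` and `matrices_equal` are made up for this illustration.

```python
def same_size(A, B):
    """True when A and B have the same number of rows and the same row lengths."""
    return len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))

def matrices_equal(A, B):
    """Equal means: same size AND every pair of corresponding entries is equal."""
    if not same_size(A, B):
        return False
    return all(a == b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

P = [[1, 2], [3, 4]]
Q = [[1, 2], [3, 4]]
R = [[1, 2, 0], [3, 4, 0]]  # same leading entries as P, but a different size

print(matrices_equal(P, Q))  # True
print(matrices_equal(P, R))  # False
```

Note that the size check comes first. Two matrices of different sizes are never equal, no matter how many entries they share.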
Taking a cue from computer science and the currently popular Python programming language, the $i$th row of a
matrix B is denoted $B_{i,:}$, the : indicating that all columns of the row are included. The $j$th column of the same matrix
B is denoted $B_{:,j}$, where the : indicates that all rows are included.


If $B = \begin{bmatrix} 2 & 6 & 1 & 8 \\ -3 & 4 & -2 & 1 \\ -2 & -5 & 4 & 1 \end{bmatrix}$ then $B_{2,:} = \begin{bmatrix} -3 & 4 & -2 & 1 \end{bmatrix}$ and $B_{:,4} = \begin{bmatrix} 8 \\ 1 \\ 1 \end{bmatrix}$.


A submatrix of a matrix M is any matrix derived by deleting some number of rows (less than the total number of 
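rows) and some number of columns (less than the total number of columns) from M.


$\begin{bmatrix} 2 & 6 & 1 \\ -3 & 4 & -2 \end{bmatrix}$ is a submatrix of $\begin{bmatrix} 2 & 6 & 1 & 8 \\ -3 & 4 & -2 & 1 \\ -2 & -5 & 4 & 1 \end{bmatrix}$
derived by deleting the last row and last column. $\begin{bmatrix} 6 \end{bmatrix}$ is a submatrix of $\begin{bmatrix} 2 & 6 & 1 \\ -3 & 4 & -2 \end{bmatrix}$ derived by deleting the
second row, the first column, and the third column. A submatrix derived by deleting one row and one column of a
matrix is common enough that we use a special notation for it: $B_{\backslash i,j}$ (read “B without row i and column j”).


If $B = \begin{bmatrix} 2 & 6 & 1 & 8 \\ -3 & 4 & -2 & 1 \\ -2 & -5 & 4 & 1 \end{bmatrix}$ then $B_{\backslash 2,3} = \begin{bmatrix} 2 & 6 & 8 \\ -2 & -5 & 1 \end{bmatrix}$.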
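
As a rough check on the "B without row i and column j" idea, here is a minimal plain Python sketch (not SageMath) that stores a matrix as a list of rows. The helper name `without` is hypothetical, chosen only for this example, and it uses the 1-based row and column numbers of the text.

```python
def without(matrix, i, j):
    """Return a copy of matrix with row i and column j deleted (1-based numbering)."""
    return [
        [value for c, value in enumerate(row, start=1) if c != j]
        for r, row in enumerate(matrix, start=1)
        if r != i
    ]

# The 3 x 4 matrix B used in this section's examples.
B = [
    [2, 6, 1, 8],
    [-3, 4, -2, 1],
    [-2, -5, 4, 1],
]

print(without(B, 2, 3))  # [[2, 6, 8], [-2, -5, 1]]
```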


Though we will not make frequent use of it, the : notation can be used to identify submatrices other than single 
columns or single rows by placing a number before and a number after the colon as in 2 : 5, which means rows (or 
columns) two through five. For example, $B_{1:2,1:3}$ represents the submatrix of B consisting of its first two rows and first
three columns. All other rows and columns are excluded. $B_{2:7,3}$ represents the submatrix of B consisting of rows two
through seven of column three. 
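
The range form of the : notation maps closely onto Python slicing. The sketch below is plain Python rather than SageMath, and the function name `submatrix` is an assumption made for illustration. It takes the 1-based, inclusive row and column ranges used in the text and converts them to Python's 0-based, end-exclusive slices.

```python
def submatrix(matrix, i, j, k, l):
    """Rows i through j and columns k through l, 1-based and inclusive."""
    # Python slices start at 0 and stop before the end index,
    # so the range i..j becomes the slice [i-1:j].
    return [row[k - 1:l] for row in matrix[i - 1:j]]

B = [
    [2, 6, 1, 8],
    [-3, 4, -2, 1],
    [-2, -5, 4, 1],
]

print(submatrix(B, 1, 2, 1, 3))  # [[2, 6, 1], [-3, 4, -2]]
print(submatrix(B, 2, 3, 4, 4))  # [[1], [1]]
```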

Often the entries of a matrix will have underlying meaning, derived from the context of a story problem or 
application. A table or spreadsheet of common grocery items at various stores such as 


Price Comparison ($)
Item            eggs    milk    bananas
Food Plus       2.89    4.69    2.07
Grocer Girl     3.69    4.99    2.37
Eddie’s Eats    2.79    4.29    2.57


would be summarized in a matrix as 


$$\begin{bmatrix} 2.89 & 4.69 & 2.07 \\ 3.69 & 4.99 & 2.37 \\ 2.79 & 4.29 & 2.57 \end{bmatrix}$$


All the descriptive words are stripped. Labels would only get in the way of any mathematical operation, so the rows
and columns of a matrix are not labeled. Their meaning must be communicated some other way. View this video¹
(3:13) for more examples where a matrix might be useful.

While the meaning of the entries in the grocery items example is stated explicitly in the table/spreadsheet, there
are times when meaning will simply be implied or understood from context. In any case, if the numbers in a matrix
are to have contextual meaning, that information must be supplied separately.


¹ https://youtu.be/BZWFkUQ3tco?t=71

Key Concepts


matrix A rectangular array of algebraic quantities usually delimited by parentheses or square brackets. Upper case 
letters are used for variables representing matrices. 


(matrix) entry One of the individual quantities in a matrix. 
(matrix) size The number of rows “by” the number of columns. 
matrix equality Two matrices are equal if they are the same size and corresponding entries are equal. 


submatrix The matrix resulting from deleting some number of rows (less than the total number of rows) and some 
number of columns (less than the total number of columns) from a matrix. 
notation A, B, ..., M, ... Upper case letters are used for variables representing matrices.
$A_{i,j}$ The entry in row i and column j of matrix A.
$A_{m,:}$ Row m of matrix A.
$A_{:,n}$ Column n of matrix A.
$A_{\backslash m,n}$ The submatrix of A consisting of all entries except those in row m or in column n.
$A_{i:j,k:l}$ The submatrix of A consisting of rows i through j of columns k through l.
$m \times n$ The size of a matrix with m rows and n columns.


SageMath 


The matrices and operations of this section (and the entire text) can be handled electronically by SageMath. All you 
need is the syntax, the proper combinations of words and symbols. A matrix must be defined before it can be used in 
any computation. In SageMath, there are several ways to define a matrix, but we will most often use the syntax 


M = matrix(rows,cols,[list of entries]) 


The rows and cols stand for the number of rows and number of columns in the matrix, respectively. The list of
entries must be comma-separated as in 1,2,3. Entered into SageMath properly, this line creates the variable M
as a matrix from which we can extract entries or submatrices, print out, or perform operations. Just as we can on
paper, we can name the matrix using any letter. It does not have to be M.


Crumpet 3: Instantiation


In computer science, giving a value to a variable is called instantiation.

Submatrices can be extracted using : notation just as we have been doing on paper. In SageMath, though, subscripts
aren’t used. Square brackets are. So $M_{3,:}$ would be written M[2,:] in SageMath. Yes, that looks like a typo.
It is not! On paper, and in mathematics generally, we index the rows and columns of matrices in a way that seems
most natural. The first row is row 1, the second row is row 2, and so on. However, SageMath uses the very common
computer programming convention of 0-indexing. Counting starts with 0 instead of 1 in SageMath. So the first row
of a matrix M (in SageMath, Python, and many other programming languages) is row 0, the second row is row 1, and
so on.

The square bracket notation is used to extract entries of a matrix, too. In SageMath the i, j-entry of a matrix M is 
indicated by M[i-1, j-1]. Table 1.1 summarizes the extraction of entries and submatrices using SageMath. 
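
The shift from 1-based to 0-based numbering can be seen without SageMath at all. The following is a minimal sketch in plain Python, storing the matrix as a list of rows. The helper names `entry`, `row`, and `column` are invented for this example and are not SageMath functions.

```python
# The 3 x 2 matrix with entries 1 through 6, stored as a list of rows.
M = [
    [1, 2],
    [3, 4],
    [5, 6],
]

def entry(matrix, i, j):
    """The i,j-entry as numbered on paper: subtract 1 from each index."""
    return matrix[i - 1][j - 1]

def row(matrix, i):
    """Row i as numbered on paper."""
    return matrix[i - 1]

def column(matrix, j):
    """Column j as numbered on paper."""
    return [r[j - 1] for r in matrix]

print(entry(M, 3, 1))  # 5
print(row(M, 3))       # [5, 6]
print(column(M, 2))    # [2, 4, 6]
```

The third row on paper lives at position 2 in the computer, which is exactly why the subscript 3 turns into the index 2.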

Table 1.1: Matrices, entries, and submatrices in SageMath.

            Mathematics               SageMath (0-indexed)
matrix      $M = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$    M = matrix(3,2,[1,2,3,4,5,6])
row         $M_{r,:}$                 M[r-1,:]
column      $M_{:,c}$                 M[:,c-1]
submatrix   $M_{i:j,k:l}$             M[i-1:j,k-1:l]
submatrix   $M_{\backslash r,c}$      M.delete_rows([r-1]).delete_columns([c-1])
entry       $M_{r,c}$                 M[r-1,c-1]


In SageMath, the lines

M = matrix(3,2,[1,2,3,4,5,6])
S = M[2,:]

define a matrix M and a submatrix S, but do not create any output. When the code is run (evaluated), it seems nothing
has happened! Test it by following SageMath Cell 1. Rest assured, these lines cause SageMath to do things
internally. We just aren’t seeing the results yet.

If we add a couple lines requesting the display of our matrices, we will see the results. In SageMath, this can be
done with the print(object) statement. In this case, we want to print out two matrices. A little space between
them would be good too. Printing “nothing”, using print(), actually prints a blank line. The following SageMath
code creates a matrix M, a submatrix S, and prints them both with a blank line between.


M = matrix(3,2,[1,2,3,4,5,6])
S = M[2,:]
print(M)
print()
print(S)


Here is a screenshot of this code being processed at SageCell.SageMath.org.


[Screenshot: the SageMathCell page with the code above typed into the input box, above the Evaluate button.]


Live at SageMath Cell 2.

Crumpet 4: Nested Statements in SageMath


SageMath statements may be nested. One statement may appear as the argument (inside) of another. For example,
the code

M = matrix(3,2,[1,2,3,4,5,6])
S = M[2,:]
print(M)
print()
print(S)

might also be written as follows.

M = matrix(3,2,[1,2,3,4,5,6])
print(M)
print()
print(M[2,:])

Notice the extraction of the third row of M happens inside the print statement. There is no need to produce a variable
named S since it is not used for any other purpose.

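Exercises

1. How many rows does a matrix with the given size have?
(a) 15 × 6 [S]-279
(b) 6 × 8
(c) [size illegible]
(d) 17 × 2

2. How many columns does a matrix with the given size have?
(a) 5 × 10
(b) [size illegible: 125] [S]-279
(c) [size illegible]
(d) 18 × 19

3. How many entries does a matrix with the given size have?
(a) 3 × 13
(b) 9 × 8
(c) 4 × 14 [S]-279
(d) 7 × 6

4. Identify the requested entry of the given matrix.
(a) An entry of A [subscript illegible], where
$A = \begin{bmatrix} \text{[row illegible]} \\ \text{[row illegible]} \\ -22 & 48 & -17 & -48 & 41 \end{bmatrix}$
(b) An entry of B [subscript illegible], where
$B = \begin{bmatrix} \text{[row illegible]} \\ 3 & -30 & 7 \\ -27 & -48 & 32 \end{bmatrix}$
(c) $P_{3,4}$, where
$P = \begin{bmatrix} 47 & 14 & -10 & 10 & -11 \\ 21 & -29 & -39 & 49 & -26 \\ -22 & 20 & 12 & 37 & 44 \\ -18 & -37 & -30 & -42 & -17 \end{bmatrix}$
(d) An entry in row 3 of M [column subscript illegible], where
$M = \begin{bmatrix} 21 & -14 & 43 & 34 \\ 8 & -32 & -3 & -20 \\ -2 & 50 & -24 & 20 \end{bmatrix}$ [S]-279

5. Let
$N = \begin{bmatrix} \text{[row illegible]} \\ -5 & 12 & 3 & 2 & -4 & -7 \\ -8 & -12 & 11 & -12 & 4 & -3 \\ 3 & 10 & 9 & 0 & -7 & -10 \\ -2 & 12 & -9 & 3 & -5 & 8 \end{bmatrix}$. What is
the size of the submatrix?
(a) [subscript illegible]
(b) [subscript illegible] [A]-347
(c) $N_{:,2}$
(d) $N_{:,3}$ [S]-279
(e) [subscript illegible]
(f) $N_{?:3,4:5}$ (first row index illegible) [A]-347
(g) [subscript illegible]
(h) $N_{\backslash 4,2}$ [S]-279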
-7 2
-8 -ll 
1 -10 
6. Let A = 9 4 
6 -9 
2 -4 
submatrix of A. 
(a) As, 
(b) Ag; [S]-279 
(c) A.2 
(a) A,s [A]-347 
(e) A2:3,3:5 
(f) Az.42.3 [S]-279 
(g) Aya2 
(h) A\35 [A]-347 
ll 7 
-8 -6 
7. Let P= 5 24 
8 3 
or submatrix. 
(a) 5 
ll 4 
(b) | -8 1 
—2 5 
-8 -6 1 
©) | 2 <1) 4 
@[s 3 12] 
(e) -2 
11 
-8 
] 


ne 


. Identify the 


. Supply notation for the entry 


8. Let $B = \begin{bmatrix} 6 & -7 & 1 \\ 11 & 2 & 2 \\ -7 & 9 & 10 \end{bmatrix}$. Write
SageMath code that will accomplish the following.
(a) Create the matrix B.
(b) Print B.
(c) Extract a submatrix of B [subscript illegible].
(d) Print $B_{2,3}$.

9. Let $C = \begin{bmatrix} -4 & -9 & 13 & -11 \\ 7 & 5 & -14 & -2 \\ -12 & 12 & 8 & 11 \end{bmatrix}$.
Write SageMath code that will accomplish the following. [A]-347
(a) Create the matrix C.
(b) Print C.
(c) Extract a submatrix of C [subscript illegible].
(d) Print a submatrix of C [subscript illegible].

10. [SageMath Cell 3] Add code that will print out (i) the
third row and (ii) the first column of D.

D = matrix(3,4,[-2,10,-7,8,-2,7,
11,-7,-4,5,6,10])

What is the output of your code?

11. [SageMath Cell 4] If you swap the 3 and the 4 in the
code of exercise 10, as shown below, what is the new
output of your code (third row and first column of D)?

D = matrix(4,3,[-2,10,-7,8,-2,7,
11,-7,-4,5,6,10])


1.2 Component-wise Matrix Operations


While a sudoku board is not a matrix, if we strip away the color and the lines, it certainly is a rectangular array of 
numbers, the essence of a matrix. Soon we will do just that, but for now let’s have a look at the sudoku board without 
thinking about matrices. Notice it consists of nine 3 x 3 blocks. 


Pick your favorite two 3 x 3 blocks and think about how you might add them to one another. Don’t just read on. Stop 
and think about this briefly. There are no right or wrong answers. What would be one reasonable way to add two 
blocks? If you are like most students, you probably came up with one of two ways to add the blocks. The first one 
is to add all the numbers in each block. If you did this, you should have gotten 90 for the total. Sum the numbers 
in a different pair of blocks, and you will notice you get 90 again. The sum of the 18 numbers in any two blocks is 
90. Can you see why? Answer on page 12. This way of adding is legitimate, but maybe a little unsatisfying since the 
sum is always 90. 

What if the sum of the two blocks were another 3 x 3 block? This way of thinking has a lot of precedent in 
mathematics. The sum of two integers is an integer. The sum of two rational numbers is a rational number. The sum 
of two functions is a function. The sum of two areas is an area. The operation of addition always seems to preserve 
the type of object being added. 


Crumpet 5: Operators 


In mathematics a binary operator, such as +, takes two objects (inputs or addends) from a set and produces a third 
object (output or sum) from the same set. 


With this idea in hand, perhaps the most organized way to proceed is to add the number in the upper-left corner of the 
first block to the number in the upper-left corner of the second block to produce the number in the upper-left corner of 
the sum. Similarly, the other 8 numbers of the sum can be produced by adding corresponding numbers (by location) 
of the two blocks being added. Here is an illustration of that process. 


1+7 | 4+4 | 7+6         8 |  8 | 13
2+2 | 6+8 | 3+3    =    4 | 14 |  6
8+9 | 5+5 | 9+1        17 | 10 | 10


The exact same component-wise (entry-by-entry) mechanics are used for adding matrices. Using matrix entry
notation,

if A and B are matrices, then $(A + B)_{i,j} = A_{i,j} + B_{i,j}$

for all entries $A_{i,j}$ and $B_{i,j}$ of A and B. That is, the i,j-entry of A + B is the sum of the i,j-entries of A and B. For
the sum to be defined “for all entries $A_{i,j}$ and $B_{i,j}$”, A and B must have the exact same size. The sum of matrices of
differing size is undefined. Subtraction of matrices is defined analogously.


If A and B are matrices, then $(A - B)_{i,j} = A_{i,j} - B_{i,j}$

for all entries $A_{i,j}$ and $B_{i,j}$ of A and B. The difference of matrices of differing size is undefined.

All that is to say we add matrices the same way we added the sudoku blocks and we can subtract matrices in a 
similar manner. Transferring the numbers of a sudoku board to a matrix is good practice in creating matrices where 
there are none, extracting them from their context for mathematical work. Let’s start by looking at each 3 x 3 block 
of the sudoku board as a matrix. 


[Figure: the sudoku board written as a 9 × 9 array of numbers, partitioned into nine 3 × 3 matrices.]


Previously we added the upper-left block and the middle block of the sudoku board. Now let’s add the upper-left 
matrix and the middle matrix: 


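$$\begin{bmatrix} 1 & 4 & 7 \\ 2 & 6 & 3 \\ 8 & 5 & 9 \end{bmatrix} + \begin{bmatrix} 7 & 4 & 6 \\ 2 & 8 & 3 \\ 9 & 5 & 1 \end{bmatrix} = \begin{bmatrix} 1+7 & 4+4 & 7+6 \\ 2+2 & 6+8 & 3+3 \\ 8+9 & 5+5 & 9+1 \end{bmatrix} = \begin{bmatrix} 8 & 8 & 13 \\ 4 & 14 & 6 \\ 17 & 10 & 10 \end{bmatrix}$$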


Conceptually, it is the same computation. 
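
For readers who like to see the mechanics spelled out, here is a minimal plain Python sketch (not SageMath) of component-wise addition and subtraction, including the requirement that the sizes match. The names `size`, `add`, and `subtract` are chosen for this illustration only.

```python
def size(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))

def add(A, B):
    """Component-wise sum; undefined (an error) when the sizes differ."""
    if size(A) != size(B):
        raise ValueError("A + B is undefined for matrices of differing size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def subtract(A, B):
    """Component-wise difference; undefined (an error) when the sizes differ."""
    if size(A) != size(B):
        raise ValueError("A - B is undefined for matrices of differing size")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

upper_left = [[1, 4, 7], [2, 6, 3], [8, 5, 9]]
middle = [[7, 4, 6], [2, 8, 3], [9, 5, 1]]

print(add(upper_left, middle))       # [[8, 8, 13], [4, 14, 6], [17, 10, 10]]
print(subtract(upper_left, middle))  # [[-6, 0, 1], [0, -2, 0], [-1, 0, 8]]
```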
Multiplying a matrix by a number is also done component-wise. Multiplying the bottom-left 3 × 3 matrix extracted
from our sudoku board by 5 is done as follows.


$$5 \begin{bmatrix} 6 & 3 & 4 \\ 5 & 9 & 2 \\ 7 & 1 & 8 \end{bmatrix} = \begin{bmatrix} 5 \cdot 6 & 5 \cdot 3 & 5 \cdot 4 \\ 5 \cdot 5 & 5 \cdot 9 & 5 \cdot 2 \\ 5 \cdot 7 & 5 \cdot 1 & 5 \cdot 8 \end{bmatrix} = \begin{bmatrix} 30 & 15 & 20 \\ 25 & 45 & 10 \\ 35 & 5 & 40 \end{bmatrix}$$


This is often referred to as scalar² multiplication to differentiate it from matrix multiplication, the subject of the next
section. In symbols,

if A is a matrix and c is a scalar, then $(cA)_{i,j} = cA_{i,j}$

for all entries $A_{i,j}$ of A. This means that cA has the same size as A and the i,j-entry of cA is c times the i,j-entry of A.
To be complete, Ac is defined to equal cA.
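
Scalar multiplication is just as short to sketch in plain Python (not SageMath). The function name `scalar_multiply` is an assumption made for this example.

```python
def scalar_multiply(c, A):
    """Multiply every entry of A by the scalar c; the result has the same size as A."""
    return [[c * a for a in row] for row in A]

bottom_left = [[6, 3, 4], [5, 9, 2], [7, 1, 8]]

print(scalar_multiply(5, bottom_left))  # [[30, 15, 20], [25, 45, 10], [35, 5, 40]]
```

Unlike addition, no size check is needed here. Every matrix can be multiplied by every scalar.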


Crumpet 6: Fields 


Sets of scalars other than real numbers and complex numbers are permissible in linear algebra as long as matrix 
entries come from the same field. A field must contain an additive identity, denoted 0, and a multiplicative identity, 
denoted 1. A field with only these two elements can be defined by treating 0 and 1 as integers except that 1 + 1 = 0. 
The field of two elements is often denoted $\mathbb{F}_2$ or $\mathbb{Z}_2$.
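
One common way to model the field of two elements on a computer is arithmetic modulo 2, which makes 1 + 1 = 0 automatically. The plain Python sketch below takes that approach. The function names are hypothetical, and the two matrices are made up for the illustration.

```python
def add_f2(a, b):
    """Addition in the field of two elements: ordinary addition, reduced mod 2."""
    return (a + b) % 2

def mul_f2(a, b):
    """Multiplication in the field of two elements."""
    return (a * b) % 2

print(add_f2(1, 1))  # 0
print(mul_f2(1, 1))  # 1

# Component-wise matrix addition works the same way, using add_f2 for the entries.
A = [[1, 0], [1, 1]]
B = [[1, 1], [0, 1]]
total = [[add_f2(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
print(total)  # [[0, 1], [1, 0]]
```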


Key Concepts 


binary operator A function with two inputs and one output, all three from the same set. 


² In this textbook, the word scalar refers to either a real number or a complex number. In more abstract settings, the word scalar refers to any
element of a field.

matrix addition For any matrices A and B of the same size, the sum A + B is defined, has the same size as A and B,
and $(A + B)_{i,j} = A_{i,j} + B_{i,j}$ for all entries $A_{i,j}$ and $B_{i,j}$. If A and B differ in size, then A + B is undefined.


matrix subtraction For any matrices A and B of the same size, the difference A − B is defined, has the same size as
A and B, and $(A - B)_{i,j} = A_{i,j} - B_{i,j}$ for all entries $A_{i,j}$ and $B_{i,j}$. If A and B differ in size, then A − B is undefined.


scalar An element of a field.


scalar multiplication For any matrix A and scalar c, the scalar product cA is defined, has the same size as A, and
$(cA)_{i,j} = cA_{i,j}$ for all entries $A_{i,j}$. Moreover, Ac is defined to equal cA.


SageMath 


The syntax for scalar multiplication, matrix addition, and matrix subtraction in SageMath is much like calculator 
syntax. The plus sign is used for addition, the minus sign for subtraction, and the asterisk for multiplication. The
asterisk is not optional. Typing two quantities with no operator between produces an error. Multiplication is not 
implied by lack of a symbol. SageMath code that reproduces the calculations of this section follows. 


A=matrix(3,3,[1,4,7,2,6,3,8,5,9])
B=matrix(3,3,[7,4,6,2,8,3,9,5,1])
print(A+B)
print()
C=matrix(3,3,[6,3,4,5,9,2,7,1,8])
print(5*C)

Live at SageMath Cell 5. The output is as follows.

[ 8  8 13]
[ 4 14  6]
[17 10 10]

[30 15 20]
[25 45 10]
[35  5 40]

Exercises (e3 3.43 6.59 4 -0.78 8.68 
8) | 0.96 0.16 2.14 8.79 
1. Perform the operation if possible. 
-9 1 5 —2 T 8 
[tt Oe Og th = 3 w[ i 1 -to].[ 9 -4 >| 
Be act. eh HG 9 0 2 a as, 2 -10 -10 
4.01 1.75 : -l1 6 
oy | 2& 84 Ja} 935 149 9 2| 8 is | ish279 
8.16 -0.33 ok ae 
, : (j) 4.65 1.33 8.86 _ 1.85 64 7.33 
-5 -8 7 5 3 -4 7 -8 6.03 4.56 4.8 4.58 8.39 1.89 
MP eg ca pa) a oe a og |e 
347 fg Oe: Teele = 
8 10 7 -3 6 -3 
w | 5 |-|° a | 
a: Lt 8 oo 444 6.57 
dd) | -0.48 7.82 |- 422. 717 
~6 9 0.25 2.53 
(e) O |+] 10 
-6 0 (mm) [1 -9 6 10 ]-[ -2 -1 2 -7]| 
| 


-4 
; | [S]-279 
2 =] 3
7 3 9 
1g |*) <4 9 
3 2 9 
10 
(p) | 4 -[-1 10 -2 | 
10 
~12.96 
-0.96 
(S19) rag 
11.05 


-6 -4 -4 -6 
o| , 2 |-| i 5 | 
=T «6 3 8 


2. Suppose $M$ is a $5 \times 5$ matrix and $M + N$ is defined (the sum can be computed). How many entries does $N$ have?

3. In your own words, describe how to add or subtract two matrices, and explain how to determine whether the addition or subtraction can be done.


For the remaining exercises, let 


4a Q 47 —34 -10 <48 
8 26 43 -18 -20 -30 

Az=| -41 -40 -29 -36 -44 12 | N= 
-42 47 28 4 38 -22 
18 <15 <f 29 37 9 


-17 -37 -34 20 -14 10 
—23 44 47 18 19 49 
Q=| 11 33 35 -50 2 9 |T= 
-36 -18 7 17 -49 31 
-8 16 28 -32 -2 5 


8. [SageMath Cell 6] Compute $A + Q$.

9. [SageMath Cell 7] Compute $3A + 4T$. [S]-279
10. [SageMath Cell 8] Compute $N - 5T$.
11. [SageMath Cell 9] Compute $3.17(1.11Q + .22N)$.


4. Can a matrix with 29 nonzero entries be added to a matrix with 25 nonzero entries? Explain. [A]-347

5. Suppose $M$ and $N$ are matrices such that their sum is defined ($M + N$ can be computed). Is the following true or false? Explain.

$$M + N = N + M$$

6. Suppose $M$ and $N$ are matrices such that their difference is defined ($M - N$ can be computed). Is the following true or false? Explain.

$$M - N = N - M$$

[S]-279

7. Suppose $M$ is a matrix of size $3 \times 7$, $c$ is a scalar, and the matrix computation $cM$ is defined. What is the size of matrix $cM$?


$$N = \begin{bmatrix} -21 & -33 & 28 & -15 & 34 & 45 \\ 27 & 40 & -13 & -23 & -10 & 15 \\ 43 & -6 & 46 & 17 & 13 & 21 \\ -40 & -46 & 2 & 16 & 22 & -14 \\ 10 & -12 & 29 & 35 & 48 & -31 \end{bmatrix}$$

$$T = \begin{bmatrix} 40 & 47 & 13 & -2 & -22 & 3 \\ -45 & 4 & -16 & 6 & -18 & 8 \\ 18 & -26 & -27 & -19 & -48 & -35 \\ 33 & 35 & 9 & 25 & 2 & 7 \\ -8 & 10 & -12 & -34 & 11 & 38 \end{bmatrix}$$


Answers


Sudoku sum: Since each block of a sudoku board is required to contain the numbers from 1 through 9 exactly once each, the sum of a single block is $1 + 2 + 3 + \cdots + 9 = \frac{9 \cdot 10}{2} = 45$, making the sum of any pair of blocks 90.
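Figure 1.3.1: $(AB)_{2,4} = P_{2,4} = A_{2,1}B_{1,4} + A_{2,2}B_{2,4}$

[Diagram: $A$, with two entries in each row, is written to the left of $P = AB$; $B$, with two entries in each column, is written directly below $P$. The row $A_{2,1}, A_{2,2}$ lies directly left of $P_{2,4}$ and the column $B_{1,4}, B_{2,4}$ lies directly below it.]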


1.3 Matrix Multiplication 


Matrix addition, matrix subtraction and scalar multiplication are each done component-wise, something many people 
find natural. Even those for whom it does not come naturally rarely question why the operations are done the way 
they are. After explanation, they are acceptable. Devoid of context, however, there is nothing natural or intuitive 
about matrix multiplication. It’s not difficult. It just takes some getting used to. The purpose of the current section 
is to start the process of familiarization. The reason multiplication is done the way it is will not come up for a little 
while yet. In the meantime, a little patience and concentration will be enough. 

If you can master the product of a row matrix (a $1 \times n$ matrix) with a column matrix (an $m \times 1$ matrix), you can master the product of any two matrices. The following example illustrates the process.


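$$\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} 32 \end{bmatrix} \qquad (1.3.1)$$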


Given a row matrix R and a column matrix C with the same number of entries, say n, their product is the sum of the 
products of corresponding entries. That is, 


$$RC = \begin{bmatrix} r_{1,1}c_{1,1} + r_{1,2}c_{2,1} + \cdots + r_{1,n}c_{n,1} \end{bmatrix}.$$


The first entry of R (reading from left to right) corresponds with the first entry of C (reading from top to bottom). The 
second entry of R corresponds with the second entry of C, and so on. The product of the two matrices is the sum of 
these entry products. As with addition, multiplication is an operator, so the product of two matrices is a matrix: in this case, a $1 \times 1$ matrix, as shown in (1.3.1). If $R$ and $C$ differ in length, the product $RC$ is undefined.

For matrices with multiple rows and columns, this row-matrix-column-matrix calculation is repeated for each entry of their product. The $i,j$-entry of $P = AB$ is the single entry of the $i^{\text{th}}$ row of $A$ times the $j^{\text{th}}$ column of $B$, where this makes sense. If $A$ and $B$ are matrices, then the product $P = AB$ is calculated by setting $(AB)_{i,j}$ equal to the lone entry of $A_{i,:}B_{:,j}$ (where this makes sense). Several conclusions can be drawn from this description.


• The rows of $A$ and the columns of $B$ must have the same number of entries. Otherwise $A_{i,:}B_{:,j}$ is undefined.
• $P$ has the same number of rows as $A$ ($P$ and $A$ have the same height).
• $P$ has the same number of columns as $B$ ($P$ and $B$ have the same width).


These last two observations suggest an organizational technique for multiplication. Writing $B$ to the right of $A$ and just below leaves a space above $B$ and to the right of $A$ that's exactly the right size for the product $P$. Plus, the row needed for calculating $(AB)_{i,j}$ is directly left of it and the column needed for calculating $(AB)_{i,j}$ is directly below it. See figure 1.3.1.
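
The description above translates almost word for word into code. The following is a rough sketch in plain Python, assuming a matrix is stored as a list of rows. The names row_times_column and mat_mul are invented for this illustration, and the $2 \times 3$ matrix A below is an arbitrary example rather than one taken from the text.

```python
def row_times_column(row, col):
    """Sum of the products of corresponding entries of a row and a column."""
    if len(row) != len(col):
        raise ValueError("undefined: the row and column differ in length")
    return sum(r * c for r, c in zip(row, col))


def mat_mul(A, B):
    """Product P = AB, computed one entry at a time.

    Entry (i, j) of P is row i of A times column j of B, so P has as many
    rows as A and as many columns as B.
    """
    columns_of_B = [list(col) for col in zip(*B)]
    return [[row_times_column(row, col) for col in columns_of_B] for row in A]


# The row-matrix-column-matrix product of equation (1.3.1).
R = [[1, 2, 3]]
C = [[4], [5], [6]]
print(mat_mul(R, C))

# An arbitrary 2 x 3 matrix times the same 3 x 1 column gives a 2 x 1 product.
A = [[1, 2, 3], [4, 5, 6]]
P = mat_mul(A, C)
print(P)
```

The error raised inside row_times_column plays the role of "undefined" when the rows of the first matrix and the columns of the second have different numbers of entries.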
Transposition and the Dot Product


If A is a matrix, then its transpose is the matrix resulting from turning the rows of A into columns. The first row of 
the matrix becomes the first column of the transpose. The second row of the matrix becomes the second column of 
the transpose, and so on. Equivalently, the transpose of a matrix A is the matrix resulting from turning the columns 
of A into rows. The first column of the matrix becomes the first row of the transpose. The second column of the 
matrix becomes the second row of the transpose, and so on. Can you see why turning rows into columns and turning 
columns into rows are equivalent? 
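
For readers who like to experiment, here is a small sketch of transposition in plain Python, again assuming a matrix is stored as a list of rows. The function name transpose and the sample matrix are choices made for this illustration only.

```python
def transpose(A):
    """Turn the rows of A into columns (equivalently, its columns into rows)."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6]]
At = transpose(A)
print(At)

# Entry (i, j) of the transpose is entry (j, i) of the original matrix.
print(all(At[i][j] == A[j][i] for i in range(3) for j in range(2)))

# Transposing twice returns the original matrix.
print(transpose(At) == A)
```

Comparing At entry by entry with A is one way to convince yourself that turning rows into columns and turning columns into rows describe the same matrix.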

If a matrix has only one row (is a row matrix) then its transpose has one column (is a column matrix), and vice 
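versa. Using a superscript $T$ for transpose, the row-matrix-column-matrix product from the beginning of this section can be written


$$\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}^T \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \end{bmatrix} = \begin{bmatrix} 32 \end{bmatrix} \qquad (1.3.2)$$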


Writing this way may help you keep track of which numbers should be multiplied by which since they are side by side in the expression using the transpose. Combining this observation with the organizational technique of figure 1.3.1, computing the product

$$\begin{bmatrix} 1 & -2 & 4 \\ 5 & 3 & 6 \end{bmatrix} \begin{bmatrix} -2 & 0 & 9 & 3 \\ 8 & 14 & 2 & 8 \\ 1 & -1 & 7 & 5 \end{bmatrix}$$

might look (at least to start) like the following on paper.

[Handwritten figure: the first matrix is written to the left, the second below and to the right, and three entries of the product are worked out beside them: $1(-2) - 2(8) + 4(1) = -14$, $1(0) - 2(14) + 4(-1) = -32$, and $5(-2) + 3(8) + 6(1) = 20$.]

For example, the $-32$, $P_{1,2}$, is calculated by taking the row directly to its left, $\begin{bmatrix} 1 & -2 & 4 \end{bmatrix}$, and multiplying by the column directly below it, $\begin{bmatrix} 0 \\ 14 \\ -1 \end{bmatrix}$. This product is calculated to the right of the matrices and is just one of the 8 entries of the product. It looks like a lot of work, and it is! Not to worry, though. With some practice, you will become proficient and not have to write down all the individual row-matrix-column-matrix products in such detail. In fact, it will be very important that you acquire such proficiency. This row-matrix-column-matrix calculation sits at the core of linear algebra and its connection to various sciences.

If you have seen the dot product, a very similar calculation in physics or calculus, think of the row-matrix-column-matrix product as the linear algebra equivalent of the dot product.


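In physics or calculus (vectors): $\langle 5, 3, 6 \rangle \cdot \langle 0, 14, -1 \rangle = 5 \cdot 0 + 3 \cdot 14 + 6 \cdot (-1) = 36$

In linear algebra (matrices): $\begin{bmatrix} 5 \\ 3 \\ 6 \end{bmatrix}^T \begin{bmatrix} 0 \\ 14 \\ -1 \end{bmatrix} = \begin{bmatrix} 5 \cdot 0 + 3 \cdot 14 + 6 \cdot (-1) \end{bmatrix} = \begin{bmatrix} 36 \end{bmatrix}$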




It’s the same calculation! There are enough similarities between column matrices and vectors that we often use 
column matrix notation to represent vectors and call them column vectors or just vectors, and we call the row- 
matrix-column-matrix calculation the dot product. 


Crumpet 7: Row Vector 


A row matrix is sometimes referred to as a row vector and can be used to represent vectors like those in physics or 
calculus just as a column vector can. 


Thus the distinction between the two objects is blurred, but make no mistake, a column matrix is a matrix, and a 
vector is a vector. They are not the same thing. It is a convenience in linear algebra to represent vectors as column 
matrices, giving the column matrix notation two meanings, (1) a matrix, and (2) a vector. Though we try not to do this 
type of thing often in mathematics, giving a single notation multiple meanings, it happens much like words in English 
are given multiple meanings. What you can do with a ring depends entirely on what type of ring. A wedding ring 
might be worn on your ring finger, and a circus ring might contain a tiny car with two dozen clowns in it. Certainly 
not the other way around! 


Crumpet 8: Ring 


In mathematics, a ring is a set together with two binary operators that satisfy a number of properties. This is something 
you will study in abstract algebra. 


Analogously, what you can do with a one-column array of numbers depends entirely on what it represents. If it 
represents a matrix, it might be transposed or used in the solution of a system of linear equations. If it represents a
vector it might be used in the dot product with another vector or plotted in the Cartesian coordinate system. 

Notice the product in equation (1.3.2) is written as a $1 \times 1$ matrix, but the same type of matrix product is written as a scalar in the pencil-and-paper calculation of a matrix product. This is another example of a single notation having multiple interpretations, indicated through context. There is no context for equation (1.3.2), so the product is rightfully a matrix. In the calculation of a matrix product, the result of each individual dot product will become an entry (a scalar, not a matrix) in the product. The square brackets are dropped. The $1 \times 1$ matrix is treated as if it were a scalar. In fact, $1 \times 1$ matrices and scalars are often used interchangeably, jeopardizing the distinction between these two objects. Again, make no mistake, a $1 \times 1$ matrix is a matrix, and a scalar is not a matrix at all. They are different things. It is a convenience to let $1 \times 1$ matrix notation (square brackets) and scalar notation (lack of delimiters) represent one another, whichever is appropriate for the situation.

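Can you compute the products

$$\begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix}?$$

Answer on page 19. Besides good practice in multiplying matrices, this example shows that

$$\begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} \neq \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix}$$

and, more importantly, that matrix multiplication is therefore not commutative. Given matrices $M$ and $N$, we cannot expect $MN$ and $NM$ to be equal even when both products are defined.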
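
After working the two products by hand, a short program can serve as a check. The sketch below is plain Python with matrices stored as lists of rows; the helper mat_mul is a made-up name for this illustration, and M and N are the two matrices of the question above as best they can be read here.

```python
def mat_mul(A, B):
    """Product AB: entry (i, j) is row i of A times column j of B."""
    columns_of_B = list(zip(*B))
    if any(len(row) != len(col) for row in A for col in columns_of_B):
        raise ValueError("AB is undefined: rows of A and columns of B differ in length")
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]


M = [[1, -2], [3, 7]]
N = [[2, 3], [-1, 0]]

MN = mat_mul(M, N)
NM = mat_mul(N, M)

print(MN)
print(NM)
print(MN == NM)  # False: the order of the factors matters
```

Swapping in other pairs of square matrices is a quick way to see how rarely the two orders agree.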


Key Concepts 


row matrix A matrix with only one row. 
column matrix A matrix with only one column.
column vector A vector represented as a column matrix. 
row vector A vector represented as a row matrix. 


matrix multiplication For any matrices $A$ and $B$, if the rows of $A$ and the columns of $B$ have the same number of entries, then the product $AB$ is defined. Moreover, $AB$ has the same number of rows (height) as $A$ and the same number of columns (width) as $B$, and $(AB)_{i,j}$ equals the lone entry of $A_{i,:}B_{:,j}$ for all entries $(AB)_{i,j}$ of $AB$. If the rows of $A$ and columns of $B$ do not have the same number of entries, then $AB$ is undefined.


transpose For any $m \times n$ matrix $A$, the transpose of $A$, denoted $A^T$, is defined as the $n \times m$ matrix with $(A^T)_{i,j} = a_{j,i}$ for each entry $a_{j,i}$ of $A$.


vector A quantity with both magnitude and direction. 


dot product The dot product of $m \times 1$ matrices $u$ and $v$ is $u^T v$.


SageMath 
If M is a matrix in SageMath, then M.transpose() is its transpose. The following code defines the matrix $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$, extracts columns 2 and 3 as column matrices, and finds the (matrix) product $A_{:,2}^T A_{:,3}$. [SageMath Cell 10]

A=matrix(2,3,[1,2,3,4,5,6])
print("Matrix A:")
print(A)
print()
print("Treating columns 2 and 3 as matrices:")
print()
# .column() counts from 0, so column(1) is the second column of A.
# It returns a vector, which matrix(2,1,...) turns into a 2 x 1 column matrix.
c2 = matrix(2,1,A.column(1))
c3 = matrix(2,1,A.column(2))
print("column 2:")
print(c2)
print()
print("column 3:")
print(c3)
print()
print("column 2 transpose times column 3:")
print(c2.transpose()*c3)


The output of this code is 


Matrix A: 
[1 2 3] 
[4 5 6] 


Treating columns 2 and 3 as matrices: 


column 2: 
[2] 
[5] 


column 3: 
[3] 
[6] 
column 2 transpose times column 3:
[36] 


Notice the columns are displayed as column matrices, and the product is also displayed as a matrix, using the square 
brackets. The .column() method extracts a column of a matrix as a vector, however, which is why the definitions of 
c2 and c3 explicitly take each column and feed them to the matrix() function. 

On the other hand, SageMath is perfectly capable of treating the columns as vectors, as seen in the following code. [SageMath Cell 11] The * operator is used to compute the dot product of two vectors.


A=matrix(2,3,[1,2,3,4,5,6])
print("Matrix A:")
print(A)
print()
print("Treating columns 2 and 3 as vectors:")
print()
# .column() returns each column as a vector, so no conversion is needed here.
c2 = A.column(1)
c3 = A.column(2)
print("column 2:")
print(c2)
print()
print("column 3:")
print(c3)
print()
print("Dot product of columns 2 and 3:")
print(c2*c3)


The output of this code is 


Treating columns 2 and 3 as vectors: 


column 2: 


(2, 5) 


column 3: 
(3, 6) 


Dot product of columns 2 and 3: 
36 


Notice the notation for a vector (parentheses around a comma-separated list of entries), making it clear SageMath is 
interpreting the columns as vectors, not matrices. Also notice the dot product is displayed (and indeed interpreted) as 
a scalar, not a matrix. 


Exercises 
(d) [ -1 0 -3 | -2 | [S]-280 
1. Multiply if possible. 


(a) | 3 1 0 | 


Cw7"—_—_—_— 
Co Or NY 
—— 


(b) | 7 6 || i () | -3 2]| J, | wre 


5 
-9 2 
(c) | > | = | [S]-280 (g) [ 5.8 0.2 | | [A]}-347 
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(h) | -3 3 ‘iy | 


@ [7 8 2 4 || -6 1 6 4 | 


(k) [ 1.35 4.58 7.36 ll 3.36 0.25 1.6 | 


2. Multiply if possible. 


[3 als 3 


7 
a ee I -2 0| 
) | 0.03 -06 ]{ -03 
9) 425 5.09 || 4.6 


[ 23 45 | [S]-280 


—— 


| 

| 6.3 
[2 

| 


1 9 10 a nA 
(e) | 3 0 8 9 5 [S]-280 
3 8 10 
3. 8 


1 
0] : ia] ee 
4 
2 1 
@ | 3 lz » | tarsa7 
3 4 
6 7 4 
to | | 2 1 
papel 
o| 4 Ls 1 4 6 | 
7.94 |[ 9.98 2.91 
~ | 115 |} 1.48 8.05 
® | 288 || 641 9.67 
8.95 || 5.16 8.88 
6 [FAILS 4 Jee 
oe Ss 
4 2 
() | -1 0 | 
polls 3 
8 
mols ; 7 5 | ars 
9 
0 0 3 6 2 
Gas: @ oF est) 
i} —i a # 2 


(o) | 5 ain if 


5.53 5.89 
(p) | 3.47 -2.73 | mes | 


rae ie ae 2 | ee 
Vli9o 4 8 3//8 14 4 


3. Find the dot product $u^T v$.


-11 ia 
@o=[ 73 fr=[ a | 
(c) u-| = =| ; | [A]-347 
14.3 10.3 
@ u-| 13.7 v-| 2.9 | 
10 -11 
(e) | 2 | 3 | [S]-281 
-3 -10 
2 -6 
(f) u=| -6 |;v=] —-10 
12 -4 
8 5 
(g) | -7 | -11 | [A]-347 
5 -8 
4.9 3.6 
(h) u=| 04 |;v=] 2.0 
-2.5 -4.1 
-1 -7 
(i) -| : ive 2 
2 -2 
3 0 
-3 -~2 
Gj) u=} 7 |;v=] —-1 | [A]-347 
~2 5 
-8 4 


4. Redo question 3 calculating $v^T u$ instead, and compare your answers. [S]-281 [A]-347

5. Suppose $A$ is a matrix of size $2 \times 7$, $C$ is a matrix of size $5 \times 7$, and the matrix computation $A + BC$ is defined. What is the size of matrix $B$? [A]-347

6. Matrices $A, B, C, D$ are such that $(A + B)(CD)$ is defined (all of the operations are possible). If $B$ is a $3 \times 4$ matrix and $D$ is a $5 \times 8$ matrix, what are the dimensions of $A$ and $C$?

7. Describe how to multiply two matrices, and explain how to determine whether the multiplication can be done.

8. True or false? For any column vectors $u$, $v$, and $w$ with the same number of entries in each, [A]-347
(a) $u^T v = v^T u$
(b) $(u + w)^T v = u^T v + w^T v$

9. Find a pair of matrices $M$ and $N$ so that $MN$ is defined, but $NM$ is not, and therefore $MN \neq NM$.
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10. Find a pair of matrices M and N such that MN and NM (a) .)) Sagetath Cell } 12 [S]-281 
are both defined but are different sizes, and therefore —3748 —5497 
MN # NM. — —3468 |. eis 2448 
11. Find a pair of 3 x 3 matrices M and N such that MN # = ates 
-3611 —2772 
NM. 
12. Can you find a pair of distinct 2 x 2 matrices M and N (b) © SageMath Cell Bi 
such that MN = NM? —1.33017 9.21163 
; : ar 1.33699 2.87319 
13. Suppose the matrix product MN is defined (the multipli- U=1 550693 1’ ~| —9.634 
cation can be done). Which of the following is true? 9.67517 4.46961 
(a) M and N must have the same number of rows. (c) . 1) Sagetath Cell } 14 
(b) M and N must have the same number of columns. —228 —8419 
u=} —5201 };v=] —5162 
(c) The number of rows of M must equal the number —45] ~2381 


of columns of N . 


SageMath Cell 
(d) The number of columns of M must equal the num- d) @ i 


—2.6018 —7.29805 
pelateeae 5.18949 1.89209 
(e) None of the above. u=| 2.99411 |;v=| 7.33303 
7.25436 —9.41897 
14. Find the dot product, u’ v. 0.90284 0.85775 
For the remaining exercises, let 
42 QO -47 -34 -10 -48 -—21 -33 28 -15 34 £45 
8 26 43 -18 -20 -30 27 40 -13 -23 -10 15 
A=] -41 -40 -29 -36 -44 12 U=;} 43 -6 46 17 13 21 
-42 47 28 4 38 -22 -40 -46 2 16 22 -14 
18 -15 -l 29 37 9 10 -12 29 35 48 -31 
-17 -37 -34 20 -14 10 40 47 13, -2 -22 3 
-23 44 47 18 19 49 -45 4 -16 6 -18 8 
Q=] Ill 33 35 -50 2 9 |R=] 18 -26 -27 -19 -48 -35 
-36 -18 7 17. -49 31 33 35 9 25 2 7 
-8 16 28 -32 -2 5 -8 10 -12 -34 II 38 


14. [SageMath Cell 16] Compute $(A^T)(U)$ and $(U)(A^T)$. Are they equal?
15. [SageMath Cell 17] Compute $Q^T R$ and $Q R^T$. Are they equal? [S]-281


16. [SageMath Cell 18] Compute $(3Q^T - 2R^T)^T$ and $3Q - 2R$. What do you notice? Why?


17. 19 Can you determine which of the following computations are defined? Ask SageMath to 
compute them all. The ones that are undefined will produce long error messages. 
(OR)’  AUTR  QUAR’  A™QUTA 
A+R’ ~~ AR™ (R— A)? 


Answers 


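matrix products The products are

$$\begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ -1 & 9 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 2 & 3 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 3 & 7 \end{bmatrix} = \begin{bmatrix} 11 & 17 \\ -1 & 2 \end{bmatrix}.$$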
1.4 Magnitude and Orthogonality


Geometric Interpretation of Vectors 


One day my friend Victor took a 5 kilometer drive. When Victor told me this I knew just how long his drive was. It 
was 5 kilometers. When Victor added that his drive was on a very straight highway headed due east, I knew more. 
I knew which way Victor was driving. I could imagine tracing out his path on a map by drawing a horizontal arrow 
pointing to the right (eastward) with a magnitude equivalent to 5 kilometers. The arrow captures both the direction 
and magnitude of Victor’s drive. Vectors can be imagined in the same way. The vector 


$$\begin{bmatrix} 5 \\ 0 \end{bmatrix}$$

has a 5 as its first entry and a 0 as its second. Thinking of these entries as $x$- and $y$-coordinates, the five represents 5 units right (eastward) and the zero represents 0 units up (northward). In this way, the vector represents both the magnitude and direction of Victor's drive, just like the arrow. The vector and the arrow can be interpreted to represent the same thing, blurring any distinction between them.³


[Map with a scale bar (5 km, 3 mi).]


The vector/arrow represents Victor’s displacement, or movement, 5 kilometers in the eastward direction. 

Notice there is no origin on the map. This is typical of drawing vectors. They are not specified relative to an origin. 
They only represent a change in location, or displacement, starting anywhere. A vector represents the locations of 
two points relative to one another. Exactly where those two points lie is not determined by the vector itself. Further 
information is needed to locate the vector. In the case of Victor’s travel, I needed to know on what road and where he 
was driving to create an accurate picture of his drive. 

After driving 5 kilometers east, Victor exited the highway and drove 3 kilometers southeast (using a road that 
does not appear on the map). When I heard this, I was able to capture this part of Victor's journey by the vector

$$\begin{bmatrix} \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix}.$$

[Map: US 460, with a scale bar (5 km, 3 mi).]

And I knew exactly where to put it since it started just where the previous leg left off. Drawn as an arrow, the vector is the hypotenuse of a right triangle with side lengths $\frac{3\sqrt{2}}{2}$, which by the Pythagorean theorem gives it length $\sqrt{\left(\frac{3\sqrt{2}}{2}\right)^2 + \left(\frac{3\sqrt{2}}{2}\right)^2} = 3$. It has the right magnitude and it points southeast (and starts where the first leg leaves off), so it accurately represents the second leg of Victor's drive.

³ Street map minus vectors © OpenStreetMap contributors

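As the crow flies, Victor's total displacement or movement for the drive is represented by the sum of the vectors,

$$\begin{bmatrix} 5 \\ 0 \end{bmatrix} + \begin{bmatrix} \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} 5 + \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix},$$

the black vector in the diagram.

[Map: US 460 and KY 1876, with a scale bar (5 km, 3 mi), showing the gray, blue, and black vectors.]

Since addition of vectors is commutative, it does not matter which vector is plotted first. In the diagram, the gray vectors represent $\begin{bmatrix} 5 \\ 0 \end{bmatrix} + \begin{bmatrix} \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix}$ and the blue vectors represent $\begin{bmatrix} \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix} + \begin{bmatrix} 5 \\ 0 \end{bmatrix}$. After the pair of displacements, $\begin{bmatrix} 5 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix}$, Victor's total displacement is $\begin{bmatrix} 5 + \frac{3\sqrt{2}}{2} \\ -\frac{3\sqrt{2}}{2} \end{bmatrix}$ no matter which displacement comes first. The diagram illustrates the parallelogram rule for vector addition. The sum of two vectors is a diagonal of the parallelogram determined by the two vectors.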
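
Victor's drive can be replayed numerically. The following sketch uses plain Python with vectors stored as lists; the function name vec_add and the variable names are invented for this illustration. It assumes, as in the story, that east is the positive $x$ direction and north is the positive $y$ direction, so a southeast leg has a positive first entry and a negative second entry.

```python
import math


def vec_add(u, v):
    """Entry-wise sum of two vectors with the same number of entries."""
    if len(u) != len(v):
        raise ValueError("u + v is undefined: u and v differ in size")
    return [a + b for a, b in zip(u, v)]


# First leg: 5 km due east. Second leg: 3 km southeast.
leg = 3 * math.sqrt(2) / 2
east = [5.0, 0.0]
southeast = [leg, -leg]

# Either order of the two displacements ends at the same place.
total_east_first = vec_add(east, southeast)
total_southeast_first = vec_add(southeast, east)

print(total_east_first)
print(total_southeast_first)
print(total_east_first == total_southeast_first)

# The second leg really does have length 3 (up to rounding).
print(math.hypot(*southeast))
```

The two totals are the shared diagonal of the parallelogram formed by the two legs.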


Perpendicularity 


The magnitude of a vector, not surprisingly, is defined by the length of its representative arrow. A collection of vectors pointing in various directions, including vertical and horizontal, is shown below.


Regardless of which direction the vector $v = \begin{bmatrix} x \\ y \end{bmatrix}$ points, its magnitude is $\sqrt{|x|^2 + |y|^2}$ or simply $\sqrt{x^2 + y^2}$. The Pythagorean theorem can be used to calculate magnitudes of vectors that are not horizontal or vertical. Coincidentally the dot product of $v$ with itself, $v^T v$, is

$$\begin{bmatrix} x \\ y \end{bmatrix}^T \begin{bmatrix} x \\ y \end{bmatrix} = x^2 + y^2,$$

so the magnitude of $v$ can also be written as $\sqrt{v^T v}$. This expression has a nice symmetry and is independent of the number of entries in $v$. It could apply to vectors with 3, 8, or 28 entries just as well as vectors with 2 entries. The magnitude of a vector $v$, denoted $\|v\|$, is defined as


$$\|v\| = \sqrt{v^T v}.$$
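
Because the definition does not care how many entries a vector has, a single short function covers every case. The following is a sketch in plain Python with vectors stored as lists; the names dot and magnitude are chosen for this illustration, and the sample vectors are arbitrary.

```python
import math


def dot(u, v):
    """Dot product: the sum of the products of corresponding entries."""
    if len(u) != len(v):
        raise ValueError("the dot product is undefined: u and v differ in size")
    return sum(a * b for a, b in zip(u, v))


def magnitude(v):
    """Magnitude of v, computed as the square root of v dotted with itself."""
    return math.sqrt(dot(v, v))


print(magnitude([3, 4]))                 # two entries
print(magnitude([6, 5, -3]))             # three entries
print(magnitude([1, 1, 1, 1, 1, 1, 1, 1, 1]))  # nine entries
```

For two entries the result agrees with the Pythagorean theorem; for more entries the same formula simply has more squares under the root.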
The following diagram illustrates the relationship between the magnitudes of vectors $v - u$ and $v + u$. By the side-angle-side theorem from geometry the pair of triangles in each figure are congruent if and only if $\alpha = \beta$. Since $\alpha$ and $\beta$ together form a straight angle, $\alpha = \beta$ if and only if they are both right angles. Consequently the magnitudes of $v + u$ and $v - u$ are equal if and only if $u$ and $v$ are perpendicular.


[Diagram: two figures showing the vectors $v + u$ and $v - u$.]

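This observation leads to a very useful property of the dot product, exposed by the following calculation. $u$ and $v$ are perpendicular if and only if

$$\begin{aligned}
\|v + u\| &= \|v - u\| \\
\sqrt{(v + u)^T (v + u)} &= \sqrt{(v - u)^T (v - u)} \\
(v + u)^T (v + u) &= (v - u)^T (v - u) \\
(v^T + u^T)(v + u) &= (v^T - u^T)(v - u) \\
v^T v + v^T u + u^T v + u^T u &= v^T v - v^T u - u^T v + u^T u \\
v^T u + u^T v &= -v^T u - u^T v \\
2 v^T u &= -2 u^T v \\
2 v^T u &= -2 v^T u \\
4 v^T u &= 0 \\
v^T u &= 0 \qquad (1.4.1)
\end{aligned}$$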


Since each line follows logically from the previous, and vice versa, the vectors $u$ and $v$ (with two entries) are perpendicular if and only if their dot product is zero! Passing from the sixth equation to the seventh, and from the seventh to the eighth, depends on the fact that $v^T u = u^T v$. Can you show this is true for any vectors of equal size? Answer on page 26.

As with the formula $\|v\| = \sqrt{v^T v}$ for magnitude, this calculation is independent of the number of entries in the vectors. We say that vectors $u$ and $v$ of the same size are orthogonal if and only if their dot product is zero. For vectors with two or three entries this means the vectors are perpendicular. As a result, orthogonality is precisely the same as perpendicularity in two and three dimensions, and extends the idea to dimensions greater than three.

If $u$ and $v$ are placed with their tails at the same point, then $\|u - v\|$ is the distance between the heads of $u$ and $v$. See the diagram above. As such, the distance between $u$ and $v$, denoted $d(u, v)$, is defined as $\|u - v\|$. Easy to picture in two dimensions, this formula applies to vectors with any number of entries, again extending a two- and three-dimensional notion to higher dimensions.
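
The link between equal magnitudes and a zero dot product is easy to test on examples. The sketch below is plain Python with vectors stored as lists; the names dot, magnitude, distance, and is_orthogonal are invented for this illustration, and the sample vectors are arbitrary. The exact comparison with zero is only reasonable for integer entries; with decimal entries a small tolerance would be the safer assumption.

```python
import math


def dot(u, v):
    """Dot product of two vectors with the same number of entries."""
    if len(u) != len(v):
        raise ValueError("the dot product is undefined: u and v differ in size")
    return sum(a * b for a, b in zip(u, v))


def magnitude(v):
    """Square root of the dot product of v with itself."""
    return math.sqrt(dot(v, v))


def distance(u, v):
    """d(u, v): the magnitude of the difference u - v."""
    return magnitude([a - b for a, b in zip(u, v)])


def is_orthogonal(u, v):
    """True when the dot product of u and v is zero."""
    return dot(u, v) == 0


u = [1, 2]
v = [-2, 1]   # perpendicular to u
w = [3, 1]    # not perpendicular to u

for other in (v, w):
    total = [a + b for a, b in zip(other, u)]
    difference = [a - b for a, b in zip(other, u)]
    print(is_orthogonal(u, other), magnitude(total), magnitude(difference))

print(distance(u, v))
```

For the orthogonal pair the two magnitudes agree; for the other pair they do not.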


Key Concepts 


geometric interpretation of vectors Vectors are often thought of as displacements represented by arrows. 


geometric interpretation of vector sum The sum of two vectors is represented geometrically by a diagonal of the 
parallelogram determined by the two vectors. 


magnitude of a column vector $v$, denoted $\|v\|$, is the square root of the dot product of $v$ with itself, $\sqrt{v^T v}$.
orthogonal Two vectors whose dot product is defined and zero are orthogonal.


distance The distance between two vectors, $d(u, v)$, is the magnitude of their difference, $\|u - v\|$.


SageMath 


SageMath distinguishes between vectors and matrices, but just like in mathematics the distinction is blurry. The 
SageMath code 


u=vector([1,2,3]) 
v=matrix(3,1,[3,2,1]) 
print (u*v) 


[SageMath Cell 20] runs even though the third line requests the product of a vector with a matrix. SageMath treats matrix v as if it were a vector, sort of. The output of the code is


(10) 


a vector with one entry, not a scalar and not a $1 \times 1$ matrix. If v is defined as a vector as in the following code, the output is the scalar value 10, not a vector.


u=vector([1,2,3]) 
v=vector([3,2,1]) 
print (u*v) 


[SageMath Cell 21] produces
10 


SageMath's internal process of converting one type of variable to another to avoid throwing an error, a process called coercion, can produce unanticipated results. More predictable results are obtained by explicitly converting one type of variable to another. The SageMath code


u=vector([1,2,3]) 
v=matrix(3,1,[3,2,1]) 
print (u*vector(v)) 


[SageMath Cell 22] explicitly tells SageMath to treat v as a vector in the computation of the product so no coercion is needed, and it produces


10 


just as if v were defined as a vector in the first place. 

Any row or column matrix can be converted to a vector the same way. In fact, vectors can be converted to row 
or column matrices just as easily. The following code converts u to a matrix (instead of converting v to a vector) and 
then computes the dot product. 


u=vector([1,2,3]) 


v=matrix(3,1,[3,2,1]) 
print (matrix(1,3,u)*v) 


[SageMath Cell 23] produces the $1 \times 1$ matrix


[10] 
since the multiplicands are both matrices. Be aware that vectors and matrices are not equivalent in SageMath. Unexpected results may be seen when the two types are intermingled. To avoid surprises, convert one to the other explicitly as needed.

The magnitude of a vector can be computed using the .norm() method. Consistent with the developing theme, 
the .norm() method can be applied to either matrices or vectors, and the results are different! The following code 
defines the “same” vector as both a SageMath vector and a SageMath matrix and then outputs their magnitudes, or 
norms. 


u_vec=vector([6,5,-3])
u_mat=matrix(1,3,[6,5,-3])
print(u_vec.norm())
print(u_mat.norm())


[SageMath Cell 24] produces


sqrt(70)
8.366600265340756


The norm of a vector is computed symbolically while the norm of a matrix is computed as an approximate decimal equivalent. $\sqrt{70} \approx 8.366600265340756$.


Exercises (a) u-| 2 jw-| 2 | 
1. Calculate |lull. 
-11 -11 
© a-[ 3 fee] | 
id 
(c) u-| — je| : | [A]-347 
-11 3 9 
oF | 
3 aju-| 43 _f 103 
2) P10" Vee pai (m=) ia9 Y=) 39 
© u-| 3 | tar | [3 
143 (e) u=}] 2 |;v= 3 | 
@) u-| “137 | -3 -10 
10 7 a 
(e) | 2 | (f) u=| -6 |;v=] -10 | [S]-282 
3 12 4 
; (g) u= i ;V= S 
(f) | -6 | [S]-282 aceel  a e 
12 
4.9 3.6 
8 (h) u=|] 04 |;v=| 2.0 | [A]-347 
(g) u=| -7 22:5 =d,1 
5 
-1 =i 
4.9 7 -5 
(h) u=| 04 | [A]-347 @u=l| 9 BY=] _4 
25 2 2 
=) 3 0 
(us|? 7 z 
0) Gg) u=| 7 [v=] 3 [A]-347 
9 -2 5 
3 -8 -4 
-3 3. Are u and v orthogonal? 
Gj) u=}| 7 | [A]}-347 5 5 
-2 (a) u= es fool 
-8 3 


2. Calculate d(u, v). 


ool he 


11 
13 
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(j) u= 


= 
an aa 
a 
a7 


7 |;v= 


-3 
-8 


3.6 
2.0 


4 
| 5 | [A]-347 


10.2 
8.7 


=i 

: | 

~9 
=0 | [S]-282 
5 
12 
=A 


[A]-347 
=4,1 


-7 
=5 


-1 | [A]-347 


4. Find k so that the vectors are orthogonal. 


(d) 


(e) 


(f) 


| 
| 
| 
| 
| 


-3 
6 


and 


-12 


k 


5 


+ 
+ 


-7 
k 


-10 


k 


-4 


6 


| 


| [S]-282 


| 


[A]-347 


5. Find the sum of the vectors. 


25 
@) |, 
4 [-—— 
3 
0 9 ] Wo 12 «13 «14 
P I 
() |, 
4 
3 
0 W #12 «13 «14 
= I 
@ [, 
4 
ro) 
0 9 a Wo 12 «13 «14 
4 I 
[S]-282 
-6 9 ; 
6. Letu = “122 and v = 5 . Find a column vector 
w such that 


(a) $d(u + v, w) = 1$
(b) $d(u + v, w) = \sqrt{2}$
(c) $d(u + w, v) = 1$ [S]-282
(d) $d(v + w, u) = 1$
HINT: Make a sketch.


7. What conditions on a column vector $u$ will make $u^T u$ zero?

8. Give an example of a $5 \times 1$ column vector $u$ and $2 \times 1$ column vector $v$ such that the magnitude of $u$ is less than the magnitude of $v$.


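9. Let $u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ and set

$$w = \begin{bmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{bmatrix}.$$

(a) Calculate $u^T w$.
(b) Calculate $v^T w$.
(c) Are $u$ and $w$ perpendicular?
(d) Are $v$ and $w$ perpendicular?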




10. Suppose u and vy are orthogonal. 


(a) Are 3u and 4y orthogonal? u = vector([3.9,7.2,-8.4,-11.8,.5, 
-11.0,-9.5,8.6]) 
(b) Are —12.1u and 0.12v orthogonal? [S]-283 v = vector([-10.0,10.7,7.1,-6.6, 
ll. 25 Add code that will calculate the PERS) 
norm of the third column of D (treated as a vector). [S]- 
aa (a) [hull 
D = matrix(3,4,[-2,10,-7,8,-2,7, 
{1,=7,,2455,6) 101) ©) 405%) 
What is the output of your code? (ey es 
12. 26 Add code that will calculate What is the output of your code? 
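Answers

dot product equality Letting $u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$, $u^T v$ is

$$u^T v = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}^T \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$

and

$$v^T u = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}^T \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = v_1 u_1 + v_2 u_2 + \cdots + v_n u_n.$$


Since multiplication of scalars is commutative, these expressions are equal.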
1.5 The Determinant


$b^2 - 4ac$ is “the discriminant”, but why? Each quadratic function, $p(x) = ax^2 + bx + c$, has two real roots, one (repeated) real root, or two complex roots. The discriminant discriminates between which quadratics are which. If the coefficients $a, b, c$ of a quadratic function are such that $b^2 - 4ac > 0$, then the quadratic has two real roots (and no others). If the coefficients are such that $b^2 - 4ac = 0$, then the quadratic has one real root (and no others). If the coefficients are such that $b^2 - 4ac < 0$, then the quadratic has two complex roots (and no others). In this way, the quantity $b^2 - 4ac$ associated with the quadratic function $p(x) = ax^2 + bx + c$ determines what type of roots $p$ has. It is determinative of the type of roots, and in this light might just as well be known as a determinant (which means determinative). In mathematics, though, the term determinant is reserved for linear algebra. The determinant is a determinative calculation that can be made for any matrix much the same way the discriminant is a determinative calculation that can be made for any quadratic function. Exactly what the determinant determines will have to wait a short while.

The determinant of an $m \times n$ matrix is undefined if $m \neq n$, so determinants are calculated only for square matrices, those with the same number of columns as rows. The determinant of a $1 \times 1$ matrix is its lone entry. That is, the determinant of $\begin{bmatrix} a \end{bmatrix}$ is $a$. As such, the determinant is a scalar. The notations $\det A$ or $|A|$ are used to denote the determinant of the matrix $A$.

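The determinant of a square matrix with more than one row, and therefore more than one column, is defined recursively. If $A$ has $n$ rows and $n$ columns, $n > 1$, then⁴


$$\det A = (-1)^{1+1} A_{1,1} \det A_{\backslash 1,1} + (-1)^{1+2} A_{1,2} \det A_{\backslash 1,2} + \cdots + (-1)^{1+n} A_{1,n} \det A_{\backslash 1,n}. \qquad (1.5.1)$$


For example, if $A = \begin{bmatrix} -12 & 49 & -45 & -10 \\ 28 & 45 & -46 & 23 \\ -15 & -28 & 4 & -48 \\ -1 & 34 & -38 & -18 \end{bmatrix}$, then

$$\begin{aligned}
\det A &= \begin{vmatrix} -12 & 49 & -45 & -10 \\ 28 & 45 & -46 & 23 \\ -15 & -28 & 4 & -48 \\ -1 & 34 & -38 & -18 \end{vmatrix} \\
&= -12 \begin{vmatrix} 45 & -46 & 23 \\ -28 & 4 & -48 \\ 34 & -38 & -18 \end{vmatrix} - 49 \begin{vmatrix} 28 & -46 & 23 \\ -15 & 4 & -48 \\ -1 & -38 & -18 \end{vmatrix} \\
&\quad - 45 \begin{vmatrix} 28 & 45 & 23 \\ -15 & -28 & -48 \\ -1 & 34 & -18 \end{vmatrix} + 10 \begin{vmatrix} 28 & 45 & -46 \\ -15 & -28 & 4 \\ -1 & 34 & -38 \end{vmatrix} \qquad (1.5.2)
\end{aligned}$$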

The determinant of the $4 \times 4$ matrix is written in terms of the determinants of four $3 \times 3$ matrices, one application
of recursive formula (1.5.1). To this point, the computation is not so bad. It would take a minute to write down this
quantity by hand. However, you might feel no closer to the final result, which is $-393{,}294$, than before. Now there
are four separate determinants to determine. To continue the computation, the determinant of each $3 \times 3$ matrix would
be written in terms of the determinants of three $2 \times 2$ matrices, a second application of formula (1.5.1). Thus the
determinant of $A$ would be written in terms of twelve $2 \times 2$ determinants. A final application of formula (1.5.1) would
yield the determinant of $A$ in terms of twenty-four $1 \times 1$ determinants (scalars), at which point the arithmetic could be
done and the determinant determined. Hopefully you are convinced that completing this calculation by hand would
take a while and be prone to error.
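
The recursion described above is tedious by hand but short to express in code. The following is a minimal Python sketch of first-row cofactor expansion, not the book's SageMath method; the helper names `submatrix` and `det` are illustrative choices, and matrices are assumed to be square lists of rows.

```python
def submatrix(matrix, row, col):
    """Return a copy of matrix with the given row and column removed (0-based)."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det(matrix):
    """Determinant by expansion along the first row, following formula (1.5.1)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]  # the determinant of a 1 x 1 matrix is its lone entry
    total = 0
    for j in range(n):
        sign = (-1) ** j  # same parity as (-1)^(1 + (j + 1)) in 1-based indexing
        total += sign * matrix[0][j] * det(submatrix(matrix, 0, j))
    return total


A = [[-12, 49, -45, -10],
     [28, 45, -46, 23],
     [-15, -28, 4, -48],
     [-1, 34, -38, -18]]

print(det(A))  # -393294
```

Each call on an $n \times n$ matrix makes $n$ recursive calls on $(n-1) \times (n-1)$ matrices, which mirrors the four, then twelve, then twenty-four determinants counted in the text.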

The main point of this discourse is to familiarize you with the recursive definition. Making sure you get the right 
signs on the coefficients and extract the right submatrices at each step takes some practice. Can you use formula
(1.5.1) to find $\det \begin{bmatrix} 2 & 4 \\ -1 & 3 \end{bmatrix}$? Answer on page 32.

The quantities $(-1)^{1+j} \det A_{\setminus 1,j}$ of formula (1.5.1) are called cofactors. More generally, the quantity $(-1)^{i+j} \det A_{\setminus i,j}$
is called the $i,j$-cofactor of $A$. Cofactors can be computed for any row-column combination. Using the notation $C_{i,j}$
for the $i,j$-cofactor, recursion (1.5.1) can be rewritten


$$\det A = A_{1,1}C_{1,1} + A_{1,2}C_{1,2} + \cdots + A_{1,n}C_{1,n}. \tag{1.5.3}$$


*Formula (1.5.1) can be made to work for $1 \times 1$ matrices by defining $\det A_{\setminus 1,1} = 1$ for a $1 \times 1$ matrix $A$.

While more succinct, this presentation hides the details of the calculation. Each $C_{i,j}$ may be an involved calculation
itself.


The expression


$$-12 \begin{vmatrix} 45 & -46 & 23 \\ -28 & 4 & -48 \\ 34 & -38 & -18 \end{vmatrix} - 49 \begin{vmatrix} 28 & -46 & 23 \\ -15 & 4 & -48 \\ -1 & -38 & -18 \end{vmatrix} - 45 \begin{vmatrix} 28 & 45 & 23 \\ -15 & -28 & -48 \\ -1 & 34 & -18 \end{vmatrix} + 10 \begin{vmatrix} 28 & 45 & -46 \\ -15 & -28 & 4 \\ -1 & 34 & -38 \end{vmatrix}$$

from calculation (1.5.2) is an example of a linear combination. It is the sum of scalar multiples of determinants. More
generally, if $S$ is any set of objects on which addition and scalar multiplication are defined, $c_1, c_2, \ldots, c_n$ are scalars,
and objects $b_1, b_2, \ldots, b_n$ are in $S$, then the expression


$$c_1 b_1 + c_2 b_2 + \cdots + c_n b_n$$


is called a linear combination of the objects $b_1, b_2, \ldots, b_n$, and $c_1, c_2, \ldots, c_n$ are called the coefficients of the linear
combination.
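
To make the definition concrete, here is a small Python sketch that forms a linear combination of equal-length lists of numbers. The function name `linear_combination` and the sample vectors are illustrative assumptions, not part of the text.

```python
def linear_combination(coefficients, objects):
    """Return c1*b1 + c2*b2 + ... + cn*bn, where each b is a list of numbers."""
    length = len(objects[0])
    result = [0] * length
    for c, b in zip(coefficients, objects):
        for k in range(length):
            result[k] += c * b[k]  # scale entry k of b by c and accumulate
    return result


# 3*[1, 0, 2] + (-2)*[0, 1, 1]
print(linear_combination([3, -2], [[1, 0, 2], [0, 1, 1]]))  # [3, -2, 4]
```

The same two ingredients, a scalar multiple and a sum, are all that is required of the objects, which is why the definition applies equally to functions, matrices, and numbers.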


Crumpet 9: Linear Combinations 


Linear combinations appear in many contexts. 


• A polynomial in $t$ is a linear combination of the monomials $1, t, t^2, t^3, \ldots, t^n$.
• A Riemann sum is a linear combination of certain values of a function.


• The solutions of the differential equation $y'' - 4y' + 3y = 0$ are linear combinations of the functions $e^{x}$ and $e^{3x}$.


• Numerical approximations of derivatives, such as $-\frac{3}{2h} f(x_0) + \frac{2}{h} f(x_0 + h) - \frac{1}{2h} f(x_0 + 2h)$, are linear combinations
of certain values of a function.


• The left-hand side of the equation $3x - 2y = 7$ is a linear combination of the variables $x$ and $y$.


• The expected value of a random variable with finitely many possible values is a linear combination.


Addition and scalar multiplication are defined for objects such as functions, variables, numbers, integrals, vectors, 
and matrices. Each of the following is a linear combination. 


4 
3 sin(x) — 2 sin(2x) + sin(3x) 7x +2y- 5% 


$6\sqrt{2} - 2\sqrt{7}$ $\qquad$ $\int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx + \int_3^4 f(x)\,dx$
1 1 2 6) 112. 9 
eee 29 2 foals S| 


Can you think of other places where you’ve seen linear combinations? 


Sudoku Row Linear Combinations 


If you enjoy solving sudoku puzzles, give this one a shot before reading on. Answer on page 32. 
Can the third row of the 2,1-block of the completed sudoku board be written as a linear combination of its first
two rows? Maybe you feel there is a natural way to understand linear combinations of rows of a sudoku puzzle
and maybe not. It is not something done in solving the puzzle. However, if we cast each sudoku row as a $1 \times 3$
(row) matrix, operate on the row matrices and then cast back to sudoku rows, it would be as if the sudoku rows
themselves were being added. In mathematics, we might say the sudoku rows inherit the operations of addition and
scalar multiplication from the corresponding operations on matrices.

For example, 


| 1 | 4 | 7 | + | 2 | 6 | 3 |

↓ (casting to matrices)
$[\,1 \;\; 4 \;\; 7\,] + [\,2 \;\; 6 \;\; 3\,] = [\,3 \;\; 10 \;\; 10\,]$
↓ (casting to sudoku rows)
| 3 | 10 | 10 |


Scalar multiplication on sudoku rows is inherited in the same manner. With addition and scalar multiplication inherited,
linear combinations are inherited. Back to the question...

Can the third row of the 2,1-block of the completed sudoku board (on page 32) be written as a linear combination 
of the first two rows? Rephrasing, does the following equation have a solution? 


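a | 3 | 7 | 1 | + b | 9 | 6 | 2 | = | 8 | 4 | 5 |


Casting the equation in terms of matrices and solving:
$$a\,[\,3 \;\; 7 \;\; 1\,] + b\,[\,9 \;\; 6 \;\; 2\,] = [\,8 \;\; 4 \;\; 5\,]$$
$$[\,3a+9b \;\;\; 7a+6b \;\;\; a+2b\,] = [\,8 \;\; 4 \;\; 5\,]$$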


For these two row matrices to be equal corresponding entries must be equal. That is, the simultaneous equations 


$$\begin{aligned} 3a + 9b &= 8 \\ 7a + 6b &= 4 \\ a + 2b &= 5 \end{aligned}$$


must all be true. The second and third equations can be solved (as a system) by elimination, for example. The second
equation minus 3 times the third equation yields $4a = -11$, so $a = -\frac{11}{4}$. Substituting into the third equation yields
$-\frac{11}{4} + 2b = 5$, which means $b = \frac{31}{8}$. These values of $a$ and $b$ constitute the only simultaneous solution of the second
and third equations. Substituting into the first equation yields $3\left(-\frac{11}{4}\right) + 9\left(\frac{31}{8}\right) = 8$, which can be confirmed FALSE!
(The left-hand side is $\frac{213}{8}$.) Therefore there is no solution. There is no way to write the third row of the 2,1-block as a linear combination of the
first two rows.
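
The elimination above can be double-checked with exact rational arithmetic. This is a small Python sketch using the standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Second and third equations: 7a + 6b = 4 and a + 2b = 5.
# (second) - 3*(third) eliminates b: (7 - 3)a = 4 - 3*5, so 4a = -11.
a = Fraction(4 - 3 * 5, 7 - 3 * 1)
b = (5 - a) / 2  # back-substitute into a + 2b = 5

# Test the first equation, 3a + 9b = 8, with these values.
first_lhs = 3 * a + 9 * b

print(a, b, first_lhs)  # -11/4 31/8 213/8
print(first_lhs == 8)   # False
```

Since the only pair that satisfies the last two equations fails the first, the three equations have no common solution.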

By contrast the third row of the 1,3-block can be written as $-1$ times the first row plus 2 times the second row.
The third row is the linear combination of the first two rows with coefficients $-1$ and 2. Can you verify this? The
1,3-block is

| 1 | 9 | 8 |
| 4 | 6 | 5 |
| 7 | 3 | 2 |

Answer on page 32.
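Through the process of inheritance the determinant of any $3 \times 3$ sudoku block can also be calculated. For example,
the determinant of the 2,1-block is

$$\begin{aligned} \begin{vmatrix} 3 & 7 & 1 \\ 9 & 6 & 2 \\ 8 & 4 & 5 \end{vmatrix} &= 3 \begin{vmatrix} 6 & 2 \\ 4 & 5 \end{vmatrix} - 7 \begin{vmatrix} 9 & 2 \\ 8 & 5 \end{vmatrix} + 1 \begin{vmatrix} 9 & 6 \\ 8 & 4 \end{vmatrix} \\ &= 3(6 \cdot 5 - 2 \cdot 4) - 7(9 \cdot 5 - 2 \cdot 8) + 1(9 \cdot 4 - 6 \cdot 8) \\ &= 3(22) - 7(29) + 1(-12) \\ &= 66 - 203 - 12 \\ &= -149 \end{aligned}$$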


What is the determinant of the 1,3-block? Answer on page 32. 

So, for the block with determinant —149 there was no way to write the third row as a linear combination of the 
first two, and for the block with determinant 0 there was a way to write the third row as a linear combination of the 
first two. This bears further investigation, requested in the exercises. 
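
As a quick check of the two block determinants mentioned here, the following Python sketch expands a $3 \times 3$ determinant along its first row. The function name `det3` is an illustrative choice, and the entries of the 1,3-block are taken from the answers at the end of this section.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


block_2_1 = [[3, 7, 1], [9, 6, 2], [8, 4, 5]]
block_1_3 = [[1, 9, 8], [4, 6, 5], [7, 3, 2]]

print(det3(block_2_1))  # -149
print(det3(block_1_3))  # 0
```

The same function can be pointed at the remaining seven blocks when exploring the exercises.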


Key Concepts 
coefficients The scalar quantities of a linear combination.
cofactor A scalar quantity denoted $C_{i,j}$, computed for the matrix $A$ as
$$C_{i,j} = (-1)^{i+j} \det A_{\setminus i,j} \tag{1.5.4}$$
determinant The determinant of an $n \times n$ matrix $A$, denoted $\det A$ or $|A|$, is defined by
$$\det A = A_{1,1}C_{1,1} + A_{1,2}C_{1,2} + \cdots + A_{1,n}C_{1,n}$$
for $n > 1$ and $\det A = A_{1,1}$ for $n = 1$. The determinant of an $m \times n$ matrix is undefined if $m \ne n$.


linear combination An expression of the form
$$c_1 b_1 + c_2 b_2 + \cdots + c_n b_n = \sum_{i=1}^{n} c_i b_i$$
where $c_1, c_2, \ldots, c_n$ are scalars and $b_1, b_2, \ldots, b_n$ are objects from a set on which addition and scalar multiplication
are defined.


square matrix A matrix with the same number of columns as rows. An $n \times n$ matrix.


SageMath 


If M is a matrix in SageMath, then M.determinant() is its determinant. The following code computes the determinant
of $A = \begin{bmatrix} -12 & 49 & -45 & -10 \\ 28 & 45 & -46 & 23 \\ -15 & -28 & 4 & -48 \\ -1 & 34 & -38 & -18 \end{bmatrix}$, the matrix behind calculation (1.5.2).


M = matrix(4,4,[-12,49,-45,-10,28,45,-46,23,-15,-28,4,-48,-1,34,-38,-18])
print(M.determinant())


The output of this code is 
-393294 
Exercises


1. Use formula (1.5.4) to write the cofactor as a determinant.


(a) C\, of 
(b) C2 of 
(c) Crt of 


(d) Cy. of 


(f) C21 of 


(g) C22 of 


| 
| 
| 
| 
© Gs “| 
| 
| 


(h) C32 “| 


-9 3 
0 6 
9 -6 
11-9 
ah 26 
St | [S]-284 
8 13 
aa | [A]-348 
2 = 6 
-5 -10 11 | [S]-284 
9 8 O 
3: 4 
3-5 7 | [A]-348 
4 11 +7 
=12 <2 =10 
0 3 
S ad 
1 =6. 
ae en 
2 0 4 


2. Calculate the determinant if possible. 


(a) detA where A = [ 30 | 


(b) Al where A = | -6 | 


(c) detA where A = | —45 | [S]-284 


(d) det A where A =| 44 | [A]-348 


5 -2 
(e) ae 7 9 
-18 19 
Ole | 
18 5 
(g) | oe | [S]-284 
-l1l1 2 
(h) ae( 10 -7 [A]-348 
-5 2 -4 
G) | 9 O -2 
-6 8 4 
0) 9 7 
g) | -1 -6 -2 
6 -9 -5 
-3 -1 -9 
(k) det} 1 -4 -8 | [S]-284 
2 9 6 
3 -6 O 
qd) | -8 2 -7 | [A]-348 
5 1 -l 


(m) det 


= 
DNDN OC 


(n) 


N oN A 


(0) det [S]-284 


2 
4 8 6 -2 
0 -1 6 0 
0 3 -1 3 
3. Formula (1.5.1) reduces the calculation of the determinant of a $4 \times 4$ matrix into a linear combination of the
determinants of twenty-four $1 \times 1$ matrices, as in calculation (1.5.2).


(a) Formula (1.5.1) reduces the calculation of the determinant of a $5 \times 5$ matrix into a linear combination
of the determinants of how many $1 \times 1$ matrices?


(b) Formula (1.5.1) reduces the calculation of the determinant of an $n \times n$ matrix into a linear combination
of the determinants of how many $1 \times 1$ matrices?


4. True or false? [A]-348 


(a) The determinant of a matrix is a scalar. 


(b) The determinant of a matrix is always positive 
since it is the absolute value of a number. 


(c) The determinant of a 5 x 6 matrix can be written as 
a linear combination of the determinants of thirty 
1 x 1 matrices. 


(d) If A and B are 1 x 1 matrices, then det A + det B = 
det(A + B). 
(e) If A and B are 2 x 2 matrices, then det A + det B = 
det(A + B). 
5. Compare and contrast (i) scalar, (ii) 1 x 1 matrix, and (iii) 
the determinant of a 1 x 1 matrix. 
6. Calculate the determinant. HINT: Despite the large sizes 


of some of the matrices, this does not require a lot of 
work. 


7 0 
@|§ S| 
20 0 
(b) | -7 4 «0 
2 -9 7 
4 0 0 0 
=) <6: 
Ol. 7 60 
5 -7 3 9 
-1 00000 
7-30 000 
6 -43 0 00 
@O).1 5 1-20 0 
4-5 8 3 4 0 
8 9 7-909 
7. In your own words, draw a conjecture based on the calculations of question 6.


8. 28 Calculate 


(a) detA 
(b) det B 
(c) det(AB) 
(d) det(3A) 
(e) det(3B) 


What do you notice? 


The remaining exercises refer to the completed sudoku board of this section (page 32).


9. The determinants of the 1,3-block and the 2,1-block are 0 and $-149$ respectively. Find the
determinants of the remaining 7 blocks.


10. For the 1,3-block, the third row can be written as a linear combination of the first two ($-1$
times the first row plus 2 times the second row). For the 2,1-block, the third row cannot be written as a linear
combination of the first two. For the remaining 7 blocks, explore whether there is any row that can be written as a
linear combination of the other two. [A]-348


11. Make a conjecture about the connection between determinant and the possibility of writing one of the rows of a
block as a linear combination of the others.


12. Can any of the 9 rows of the sudoku board be written as a linear combination of the other 8?
Apply your conjecture from question 11 to answer the question.


Answers


determinant:

$$\det \begin{bmatrix} 2 & 4 \\ -1 & 3 \end{bmatrix} = (-1)^{1+1}(2) \det [\,3\,] + (-1)^{1+2}(4) \det [\,-1\,] = 2(3) - 4(-1) = 10$$


sudoku: [completed sudoku board]


linear combination:

$$(-1)\,[\,1 \;\; 9 \;\; 8\,] + 2\,[\,4 \;\; 6 \;\; 5\,] = [\,-1+8 \;\;\; -9+12 \;\;\; -8+10\,] = [\,7 \;\; 3 \;\; 2\,]$$


sudoku determinant:

$$\begin{aligned} \begin{vmatrix} 1 & 9 & 8 \\ 4 & 6 & 5 \\ 7 & 3 & 2 \end{vmatrix} &= 1 \begin{vmatrix} 6 & 5 \\ 3 & 2 \end{vmatrix} - 9 \begin{vmatrix} 4 & 5 \\ 7 & 2 \end{vmatrix} + 8 \begin{vmatrix} 4 & 6 \\ 7 & 3 \end{vmatrix} \\ &= 1(6 \cdot 2 - 5 \cdot 3) - 9(4 \cdot 2 - 5 \cdot 7) + 8(4 \cdot 3 - 6 \cdot 7) \\ &= 1(-3) - 9(-27) + 8(-30) \\ &= -3 + 243 - 240 \\ &= 0 \end{aligned}$$
1.6 Matrix “Division”


Given a message such as “Hello World!”, a system for converting letters and symbols to numbers can be used to turn
the message into a list of numbers. These numbers can be further disguised by multiplying by an encoding matrix,
giving a new list of numbers, a secret message! The following list of numbers was created this way.


-199 -78 14 -273 -145 -13 -572 -294 -49 -245 -127 -7 -150 -84 -1 
-389 -174 -10 412 272 103 -142 -59 16 -231 -132 -33 


Can you decode it? Learning how to decode a message like this is the topic of this section. 

You may have heard the claim “there’s no such thing as subtraction; it’s just adding the opposite” or something
like it. There is a vital concept of linear algebra buried in this adage. The link between addition, opposites, and zero
that makes subtraction optional is a well known property of real numbers. The sum of opposites is zero.

Why zero, and not some other number? Zero is that special number that can be added to any number without 
changing its value. There is no other! In symbols, a + 0 = 0 + a = a. This property is so special it has a name. Zero 
is the additive identity for real numbers—the word identity to signal this special property and the word additive to 
document the operation. The additive inverse, or opposite, of a real number is defined by the fact that adding the 
two yields the additive identity. Two numbers are additive inverses (opposites) if and only if their sum is the additive 
identity (zero). 

Likewise, one is that special number that can be multiplied by any number without changing its value. Consequently,
one is the multiplicative identity for real numbers. In symbols, $a \cdot 1 = 1 \cdot a = a$. The multiplicative inverse
(reciprocal) of a real number is defined by the fact that multiplying the two yields the multiplicative identity. Two
numbers are multiplicative inverses (reciprocals) if and only if their product is the multiplicative identity (one).

The link between multiplication, reciprocals, and one is analogous to the link between addition, opposites, and 
zero. For any real numbers a and b, 


$a$ and $b$ are reciprocals if and only if $a \cdot b = b \cdot a = 1$
and
$a$ and $b$ are opposites if and only if $a + b = b + a = 0$.


The same relationship holds among addition, opposites, and zero as holds among multiplication, reciprocals, and one. 
Addition and multiplication are operations, opposites and reciprocals are inverses, and zero and one are identities. 
There is an important analogy for matrices. To see it, compute the following products. 


Fee EA fs 4, f]fore 

0 1 || -8 i | 
100 0]{ V5 =a 183 
0100}, 2 34 4Vi9 ino 
00 1 O0}f = 034 ~ e7 — sin(l) 
000 1 2” tan!(1) -12~—«1.9!000 


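Answers on page 39. Hopefully these exercises have led you to the conclusion that multiplying by matrices such as

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$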


leaves the multiplicand (the matrix being multiplied by it) unchanged. Multiplying a matrix by matrices such as 
these does not change the value of the matrix, the exact same property that made 1 the multiplicative identity and 0 
the additive identity for real numbers. By extension, that makes these matrices identity matrices. They can each be 
multiplied with any other matrix (as long as the product is defined) without changing the other’s value. 

The $n \times n$ identity matrix is denoted $I_n$, or just $I$ when the size of the matrix is known or unimportant. Identity
matrices have ones on the main diagonal, the diagonal running from the 1,1-entry through the $n,n$-entry, and zeros
elsewhere. In symbols, $I_{1,1} = I_{2,2} = \cdots = I_{n,n} = 1$ and $I_{j,k} = 0$ whenever $j \ne k$. For any matrix $M$, $M \cdot I = I \cdot M = M$.
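
The defining property of identity matrices is easy to test numerically. Below is a minimal Python sketch; the helpers `identity` and `matmul` and the sample matrix `M` are illustrative assumptions rather than anything taken from the text.

```python
def identity(n):
    """The n x n identity matrix: ones where row index equals column index, zeros elsewhere."""
    return [[1 if j == k else 0 for k in range(n)] for j in range(n)]


def matmul(X, Y):
    """Product of matrices X (m x p) and Y (p x n), stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


M = [[2, -5, 7],
     [0, 3, 1]]  # a 2 x 3 matrix

print(matmul(identity(2), M) == M)  # True
print(matmul(M, identity(3)) == M)  # True
```

For a non-square matrix the identity on the left and the identity on the right have different sizes, since each product must be defined.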
With an identity, or really set of identities, for multiplication, we are only one element shy of the operation,
inverse, identity triumvirate for matrices: the inverse. Compute the following products.


0 0 3 #1 3. =9 =3. =5 


Answers on page 40. These exercises demonstrate that there are pairs of matrices $A$ and $B$ such that $AB = I$. But what
about $BA$? In our formulations for real number inverses we had $a + b = b + a = 0$ and $a \cdot b = b \cdot a = 1$. Unfortunately we
observed in section 1.3 that matrix multiplication is not commutative. We cannot immediately conclude that $BA = I$
just because $AB = I$. Will we get lucky, though?


Compute the following products (the same as above only in the opposite order). 


—2 5 16 8 -l -2 
5 -ll -36 3 1 1 


4 29 <3 4p 1 0 =3- 8 


2 are A 1 -2 -7 |[-4 5 3 
5 3 


-l 3 1 2 —2 -3 10 5 
3° =9° =3 =5 0 oO 3 1 


As you were hopefully able to verify, all of these products are identity matrices too! It seems that multiplicative
inverse pairs commute. That is, if $AB = I$, then $BA = I$. We finally have evidence that a matrix analogy for linking
multiplication, inverses, and identity matrices exists.


For any matrices $A$ and $B$,


$$A \text{ and } B \text{ are inverses if and only if } A \cdot B = B \cdot A = I. \tag{1.6.1}$$


Crumpet 10: Inverses of non-square matrices? 


Suppose $A$ is an $m \times n$ matrix and there are matrices $L$ and $R$ such that $LA = I$ and $AR = I$. By theorems 5 and 6
$A$ must have a pivot position in every row and every column (see section 2.2 for a definition of pivot position). The
only way that can happen is if $A$ is square. Hence a matrix with left and right inverses must be square.


This is enough that we could define $\frac{I}{A}$ as the multiplicative inverse (or reciprocal) of $A$ and have the understanding
that $A \div B$ means $A \cdot \frac{I}{B}$, much like we have for real numbers, but by convention we do not! For one thing, we would
need a second division symbol to mean $\frac{I}{B} \cdot A$ since matrix multiplication is not commutative. In general, $\frac{I}{B} \cdot A$ and
$A \cdot \frac{I}{B}$ could be unequal. Instead, we stick with the adage that “there’s no such thing as matrix division; it’s just
multiplying by the inverse”. The notation we use for the inverse of $A$ is $A^{-1}$, borrowing from the algebra of real
numbers but not using division bars or division symbols.
A Formula for the Inverse (proven in section 3.7)


For any matrix $A$, if $A$ is invertible then


$$A^{-1} = \frac{1}{\det A} \begin{bmatrix} C_{1,1} & C_{2,1} & \cdots & C_{n,1} \\ C_{1,2} & C_{2,2} & \cdots & C_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1,n} & C_{2,n} & \cdots & C_{n,n} \end{bmatrix} \tag{1.6.2}$$


where the $C_{i,j}$ are the cofactors of $A$. This implies that when $A^{-1}$ exists
1. $A$ must be square since $\det A$ and the $C_{i,j}$ are undefined if $A$ is not square, and
2. $\det A$ must be nonzero since division by 0 is undefined.


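When $A^{-1}$ is defined, we say that $A$ is invertible. The matrix


$$\begin{bmatrix} C_{1,1} & C_{2,1} & \cdots & C_{n,1} \\ C_{1,2} & C_{2,2} & \cdots & C_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1,n} & C_{2,n} & \cdots & C_{n,n} \end{bmatrix}$$


is called the adjugate of $A$, $\operatorname{adj} A$. With this definition, the formula for the inverse can be summarized as


$$A^{-1} = \frac{1}{\det A} \operatorname{adj} A.$$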
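
Formula (1.6.2) translates almost line for line into code. The following Python sketch is one possible rendering for matrices with integer entries, using exact fractions; the helper names (`submatrix`, `det`, `cofactor`, `adjugate`, `inverse`) are illustrative, and this is not how SageMath computes inverses internally.

```python
from fractions import Fraction


def submatrix(m, row, col):
    """Copy of m with the given row and column removed (0-based indices)."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(m) if i != row]


def det(m):
    """Determinant by first-row cofactor expansion; the empty matrix is given determinant 1."""
    if len(m) == 0:
        return 1  # convention that lets the recursion bottom out for 1 x 1 matrices
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def cofactor(m, i, j):
    """The i,j-cofactor (0-based indices give the same sign parity as 1-based)."""
    return (-1) ** (i + j) * det(submatrix(m, i, j))


def adjugate(m):
    """Transpose of the matrix of cofactors: entry (i, j) is the j,i-cofactor."""
    n = len(m)
    return [[cofactor(m, j, i) for j in range(n)] for i in range(n)]


def inverse(m):
    """Inverse via (1/det) * adjugate; raises ValueError when the determinant is 0."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is not invertible (determinant is 0)")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(m)]


print(inverse([[3, -4], [-5, 7]]) == [[7, 4], [5, 3]])  # True
```

Cofactor expansion is fine for small matrices like those in this section, but the amount of work grows very quickly with the size of the matrix.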


One Property of the Inverse 


Multiplication by a matrix’s inverse “undoes” multiplication by the matrix just as dividing by a number undoes
multiplication by that same number. In symbols, if $A$ and $B$ are matrices and $B$ is invertible (has an inverse)


$$(AB)B^{-1} = A \tag{1.6.3}$$


much like $(a \cdot b) \div b = a$ for real numbers. If we used division in linear algebra, the equation $(AB)B^{-1} = A$ might
be written $(A \cdot B) \div B = A$, making the comparison clearer. The only potential harm in thinking with division is that
$(BA)B^{-1}$ is generally not $A$, so $(B \cdot A) \div B \ne A$ for matrices even though $(b \cdot a) \div b = a$ for real numbers. Since
multiplication of matrices is not commutative, right-multiplication by $B^{-1}$ does not undo left-multiplication by $B$.
Using the notation $B^{-1}$ and paying close attention to right-multiplication versus left-multiplication will help keep this
straight.


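To illustrate, let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -4 \\ -5 & 7 \end{bmatrix}$, making

$$AB = \begin{bmatrix} -7 & 10 \\ -11 & 16 \end{bmatrix}.$$

Can you verify this? As seen earlier, $B^{-1} = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix}$. Compute $(AB)B^{-1}$ to see that it equals $A$, and compute $B^{-1}(AB)$
to see that it does not equal $A$. Answer on page 40.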
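
For readers who would rather check their hand computation by machine, here is a short Python sketch of the two products. The helper `matmul` is an illustrative name; the matrices are the ones from the illustration above.

```python
def matmul(X, Y):
    """Product of matrices X (m x p) and Y (p x n), stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 2], [3, 4]]
B = [[3, -4], [-5, 7]]
B_inv = [[7, 4], [5, 3]]

AB = matmul(A, B)
right = matmul(AB, B_inv)  # (AB)B^{-1}: right-multiplication by the inverse
left = matmul(B_inv, AB)   # B^{-1}(AB): left-multiplication by the inverse

print(right == A)  # True
print(left == A)   # False
```

Only the order of the factors differs between `right` and `left`, which is exactly the distinction the division symbol would hide.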


Inverses and Cryptography 


The ability to undo multiplication by an invertible matrix makes it possible to use matrices and their inverses for 
encrypting and decrypting messages. Decoding messages like the one opening this section amounts to regrouping 
the code into column vectors and multiplying by the decoding matrix—the inverse of the coding matrix. As long 
as the parties on either end of the message transmission have one matrix of some pair of inverse matrices, they can 
each encode with their matrix, send their message securely, and decode received messages. Without knowledge of 
the coding or decoding matrix, an intercepted message would be very difficult to decode! 
Table 1.2: ASCII (American Standard Code for Information Interchange) characters


Dec Hex Char                              Dec Hex Char      Dec Hex Char   Dec Hex Char
0   0   NUL (null character)              32  20  (space)   64  40  @      96  60  `
1   1   SOH (start of heading)            33  21  !         65  41  A      97  61  a
2   2   STX (start of text)               34  22  "         66  42  B      98  62  b
3   3   ETX (end of text)                 35  23  #         67  43  C      99  63  c
4   4   EOT (end of transmission)         36  24  $         68  44  D      100 64  d
5   5   ENQ (enquiry)                     37  25  %         69  45  E      101 65  e
6   6   ACK (acknowledge)                 38  26  &         70  46  F      102 66  f
7   7   BEL (bell)                        39  27  '         71  47  G      103 67  g
8   8   BS (backspace)                    40  28  (         72  48  H      104 68  h
9   9   HT (horizontal tab)               41  29  )         73  49  I      105 69  i
10  A   LF (line feed)                    42  2A  *         74  4A  J      106 6A  j
11  B   VT (vertical tab)                 43  2B  +         75  4B  K      107 6B  k
12  C   FF (form feed)                    44  2C  ,         76  4C  L      108 6C  l
13  D   CR (carriage return)              45  2D  -         77  4D  M      109 6D  m
14  E   SO (shift out)                    46  2E  .         78  4E  N      110 6E  n
15  F   SI (shift in)                     47  2F  /         79  4F  O      111 6F  o
16  10  DLE (data link escape)            48  30  0         80  50  P      112 70  p
17  11  DC1 (device control 1)            49  31  1         81  51  Q      113 71  q
18  12  DC2 (device control 2)            50  32  2         82  52  R      114 72  r
19  13  DC3 (device control 3)            51  33  3         83  53  S      115 73  s
20  14  DC4 (device control 4)            52  34  4         84  54  T      116 74  t
21  15  NAK (negative acknowledge)        53  35  5         85  55  U      117 75  u
22  16  SYN (synchronous idle)            54  36  6         86  56  V      118 76  v
23  17  ETB (end of transmission block)   55  37  7         87  57  W      119 77  w
24  18  CAN (cancel)                      56  38  8         88  58  X      120 78  x
25  19  EM (end of medium)                57  39  9         89  59  Y      121 79  y
26  1A  SUB (substitute)                  58  3A  :         90  5A  Z      122 7A  z
27  1B  ESC (escape)                      59  3B  ;         91  5B  [      123 7B  {
28  1C  FS (file separator)               60  3C  <         92  5C  \      124 7C  |
29  1D  GS (group separator)              61  3D  =         93  5D  ]      125 7D  }
30  1E  RS (record separator)             62  3E  >         94  5E  ^      126 7E  ~
31  1F  US (unit separator)               63  3F  ?         95  5F  _      127 7F  (delete)
All we need now is a method for converting letters and symbols to numbers and numbers back to letters and
symbols. While a basic conversion from letters to numbers would have each letter of the alphabet assigned a number
from 1-26 or 0-25, this would leave punctuation, symbols like spaces and hashtags, capitalization, and numbers out.
Since the early 1960s the American National Standards Institute has maintained a coding system for the electronic
transmission of documents in English called ASCII (pronounced ass-kee) or US-ASCII. Part of that system, largely
developed by Bob Bemer [18], is a numeric representation of all the symbols you are likely to find on an English
language keyboard. See Table 1.2. For example, the capital letter “A” has numeric code 65, the lower case letter “a”
has numeric code 97, and the space has numeric code 32.


Using the coding matrix $\begin{bmatrix} -7 & 3 & 2 \\ -4 & 1 & 2 \\ -1 & 0 & 1 \end{bmatrix}$, the message “Hello World!” would be encrypted as follows.


1. “Hello World!” is converted to the numeric sequence 72 101 108 108 111 32 87 111 114 108 100 33 using 
ASCII. 


2. Since we are using a $3 \times 3$ coding matrix, the numeric sequence is grouped three at a time into the columns of the 3-row matrix


$$\begin{bmatrix} 72 & 108 & 87 & 108 \\ 101 & 111 & 111 & 100 \\ 108 & 32 & 114 & 33 \end{bmatrix}$$


If the message did not have a multiple of three characters, 0’s (null characters) could be added to the end. 


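3. The message matrix is multiplied by the coding matrix.


$$\begin{bmatrix} -7 & 3 & 2 \\ -4 & 1 & 2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 72 & 108 & 87 & 108 \\ 101 & 111 & 111 & 100 \\ 108 & 32 & 114 & 33 \end{bmatrix} = \begin{bmatrix} 15 & -359 & -48 & -390 \\ 29 & -257 & -9 & -266 \\ 36 & -76 & 27 & -75 \end{bmatrix}$$


This is a good place to use a calculator or SageMath!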


4. The coded message is extracted from the product: 


15 29 36 -359 -257 -76 -48 -9 27 -390 -266 -75 
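
The four steps above can be sketched in a few lines of Python. The function name `encode` is an illustrative choice; Python's built-in `ord` returns Unicode code points, which agree with the ASCII codes for the characters used here.

```python
def encode(message, coding_matrix):
    """Encode a message by multiplying its character codes, column by column, by the coding matrix."""
    n = len(coding_matrix)
    codes = [ord(ch) for ch in message]    # step 1: characters to numbers
    codes += [0] * (-len(codes) % n)       # pad with null characters to a multiple of n
    coded = []
    for start in range(0, len(codes), n):  # step 2: group n at a time (one column each)
        column = codes[start:start + n]
        for row in coding_matrix:          # step 3: coding matrix times the column
            coded.append(sum(r * x for r, x in zip(row, column)))
    return coded                           # step 4: read the product column by column


C = [[-7, 3, 2],
     [-4, 1, 2],
     [-1, 0, 1]]

print(encode("Hello World!", C))
# [15, 29, 36, -359, -257, -76, -48, -9, 27, -390, -266, -75]
```

Working one column at a time gives the same numbers as forming the whole message matrix first, because each column of a matrix product depends only on the corresponding column of the right-hand factor.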


The message at the beginning of this section was encoded with the same matrix. Can you decode it (using a calculator 
or SageMath to assist)? Answer on page 40. 


Crumpet 11: Lester S. Hill 


The first documented multiple-letter cipher is attributed to Lester S. Hill. His Mathematical Monthly article of 1929 
[12] outlines a procedure very similar to the one presented here except modular arithmetic is used to make sure all 
numbers in the encoded message are valid character codes. Thus the encoded message is transmitted as a sequence of 
letters and symbols, not numbers. His work far predates electronic computing devices so, to be practical, he needed 
a way to limit the difficulty of doing the computations, a second impetus for using modular arithmetic. 


Key Concepts 
$A^{-1}$ The inverse of $A$. Can be computed via (1.6.2).
adjugate For a square matrix, the transpose of its matrix of cofactors. 


identity matrix A matrix with ones on the main diagonal and zeros elsewhere. 
matrix inverse Matrices $A$ and $B$ are inverses of one another if and only if $AB = BA = I$.
invertible (matrix) A matrix whose inverse is defined. 


main diagonal The i,i-entries of a matrix. 


SageMath 
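If M is a matrix in SageMath, then M.inverse() is its inverse. The following code computes the inverses of

$$A = \begin{bmatrix} 4 & -9 & -3 & -4 \\ -1 & 1 & 0 & 1 \\ -1 & 3 & 1 & 2 \\ 3 & -9 & -3 & -5 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} \sqrt{5} & -\pi & 18 \\ \frac{17}{8} & 34 & \sqrt[5]{19} \\ \frac{\pi}{4} & 0.34 & e^{7} \end{bmatrix}$$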


A = matrix(4,4,[4,-9,-3,-4,-1,1,0,1,-1,3,1,2,3,-9,-3,-5])
print(A.inverse()); print()
B = matrix(3,3,[sqrt(5),-pi,18,17/8,34,19^(1/5),pi/4,0.34,e^7])
print(B.inverse())


The output for $A^{-1}$ is


[ 1  0 -3 -2]
[ 1  1 -6 -3]
[-2 -3 10  5]
[ 0  0  3  1]


but the output for B“! is far too long to fit on the page. Just the 1,1-entry is 1/5( V5n - V5( V5r2 + 6.8)/( V5n + 
80))( V52(153 V5 — 20 - 191/5))/(-V5m + 80) — 153 ¥5)/(153 V5m — (-V5z? + 6.8)(153 V5 — 20- 191/5)/( V5 + 80) — 
170e7) + 1/5 V5 — 2/( V5z + 80). 


Exercises | 0 -3 2 | 
agj}s5 1 2 
1. Compute the inverse if possible. 5 -8 8 
@ | -3 | 6 3 0 
(b) [ 0 | (m) | -1 -1 6 | [S]-285 
0 0 7 
() | % | [s}285 
3-2 -6 
(d) [ 4m | [A}-348 (n) | -1 1 3 | [A}348 
3-2 -4 3 10 
©} ay -7 
G a7 «2 <1] 
) | vi2 3 | ()| 0 12 2 12 
2 vB -1 -9 6 10 
I 
(g) a - | [A]-348 -4 -9 0 
t <6 =] 
— | i dt 9 | ae 
*) | -5 4 [S]-285 3 8 10 
.f 2-3 wv 2 1 6 2 
@ 12 V28 5 | ti a ae eee 
a. (Ve cp 7g ce | TE 
(j) | 7 8 | 3 0 -Il 9 
et 2 20 0 =-!1 
a a) Oi =e 3 
(ky) | 0 2 0 O11 02 0 
7 8 1 3 1 7 +0 
4 7 0 8 (d)
oe 0 -2 -4 -1 0 
0 2 1 -! 0 4 7 2 =«0 
0 0 2 -2 4 4 10 0 1 
2. Compare and contrast the inverse of a 1 x 1 matrix with 0 -1 -2 0 0 
the multiplicative inverse of a real number. -lo1 2 1 0 
3. True or false? If all the entries in a square matrix M are 7. Find x and y so that A and B are inverses. 
integers and det M = 1, then all the entries in M~' are 
integers. 1 1 -5 16 11 9 
4. Explain how the determinant can help determine whether an ; . 4 ae ‘i 2 7 
a matrix has an inverse. 
5. Suppose B is an invertible 3 x 3 matrix and HINT: You do not need to calculate A~'. Use the fact 
that whenever A and B are inverses, AB = BA = I. 
14 —-70 80 64.9 
-29 95 |-B=| -62 -52 |. 7 = 2 
-12 -43 32 52 For the remaining exercises, let A = | -3 3 1 and 
3 2 1 
80 64.9 1 2 2 
Find] -62 -52 |-B!. [S]-286 B=| 2 8 #7 
-32 52 =3. =) =) 
6. Which matrix is the inverse of 
8. 34 Compute 
0 0 4 0 -!1 ‘ 
24 4 -1 | (a) A“(AB) 
-4 7 10 -2 2 |? (b) (AB)A™! 
-l 2 0 0 1 1 
BU (BA 
0 0 1 0 0 ee 
(d) (BA)B"! 
(a) ; 
=[ 4 =) =f 8 Did you get what you expected? 
0 2 -1 0 2 
0 0 0 0 1 9. 35 Compute 
1 -l 0 2 O (a) (AB)! 
-l 0 0 0O 4 (b) A-!B-! 
(b) (c) BA" 
-l 4 2 -1 
0 2 -1 O What do you notice? [S]-286 
p G 2 Q SageMath Cell 
1 -1 O 2 10. Decode the message -589 861 339 - 
-1 0 0 4 602 958 317 -244 224 180 -546 768 325 33 -99 0. It was 
encoded using 
(c) 
4 -1 -2 -1 8 1 -4 -2 
2 0 -l 0O 2 -3 «7 3 
0 O O O 1 0 2 1 
1 -l 0 2 O 
-1 0 4 0 0 [S]-287 
Answers 


matrix products part 1: 


3 4 -1 of-l 3. 4. 72 
17 -21 55 jt Lae et. 5 
100 0]{ V5 -r 18 32 V5 -t 18 32
010 0}; 2 34 Vi9 m9 |_| 2 34 VI9 Ind 
0010 t 0.34 e7 sin(l) | | 2% 0.34 — e7_— sin(1) 
0 0 0 14] 2” tan) 12 — 101000 2” tan(1) 12 — 101000 
matrix products part 2: 
Pal) eo Se Lo ae 
SS ae | at 
Bess, 3 i, tee tag 10 0 
Bo ed SO ee Be TGS et coe a0 
<1. o 5 -ll -36 001 
f 20s 6. ep on 20 8 4 100 0 
i We S65 Ss ets A, Oe see s0e ar “0-20 
953° 10-30 |) el. 3 TOO 1-0 
0 0 3 1 3 -9 -3 -5 0001 


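inverse undoes multiplication:

(i)
$$(AB)B^{-1} = \begin{bmatrix} -7 & 10 \\ -11 & 16 \end{bmatrix} \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$
$$\begin{aligned} ((AB)B^{-1})_{1,1} &= (-7)(7) + (10)(5) = -49 + 50 = 1 \\ ((AB)B^{-1})_{1,2} &= (-7)(4) + (10)(3) = -28 + 30 = 2 \\ ((AB)B^{-1})_{2,1} &= (-11)(7) + (16)(5) = -77 + 80 = 3 \\ ((AB)B^{-1})_{2,2} &= (-11)(4) + (16)(3) = -44 + 48 = 4 \end{aligned}$$


(ii)
$$B^{-1}(AB) = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} -7 & 10 \\ -11 & 16 \end{bmatrix} = \begin{bmatrix} -93 & 134 \\ -68 & 98 \end{bmatrix}$$
$$\begin{aligned} (B^{-1}(AB))_{1,1} &= (7)(-7) + (4)(-11) = -49 - 44 = -93 \\ (B^{-1}(AB))_{1,2} &= (7)(10) + (4)(16) = 70 + 64 = 134 \\ (B^{-1}(AB))_{2,1} &= (5)(-7) + (3)(-11) = -35 - 33 = -68 \\ (B^{-1}(AB))_{2,2} &= (5)(10) + (3)(16) = 50 + 48 = 98 \end{aligned}$$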

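decoding: The coding matrix $C = \begin{bmatrix} -7 & 3 & 2 \\ -4 & 1 & 2 \\ -1 & 0 & 1 \end{bmatrix}$ has determinant one:

$$\begin{aligned} \det C &= -7 \begin{vmatrix} 1 & 2 \\ 0 & 1 \end{vmatrix} - 3 \begin{vmatrix} -4 & 2 \\ -1 & 1 \end{vmatrix} + 2 \begin{vmatrix} -4 & 1 \\ -1 & 0 \end{vmatrix} \\ &= -7(1) - 3(-4 + 2) + 2(0 + 1) \\ &= -7 + 6 + 2 \\ &= 1 \end{aligned}$$


The following are cofactors, not entries:


$$C_{1,1} = \begin{vmatrix} 1 & 2 \\ 0 & 1 \end{vmatrix} = 1 \qquad C_{1,2} = -\begin{vmatrix} -4 & 2 \\ -1 & 1 \end{vmatrix} = 2 \qquad C_{1,3} = \begin{vmatrix} -4 & 1 \\ -1 & 0 \end{vmatrix} = 1$$
$$C_{2,1} = -\begin{vmatrix} 3 & 2 \\ 0 & 1 \end{vmatrix} = -3 \qquad C_{2,2} = \begin{vmatrix} -7 & 2 \\ -1 & 1 \end{vmatrix} = -5 \qquad C_{2,3} = -\begin{vmatrix} -7 & 3 \\ -1 & 0 \end{vmatrix} = -3$$
$$C_{3,1} = \begin{vmatrix} 3 & 2 \\ 1 & 2 \end{vmatrix} = 4 \qquad C_{3,2} = -\begin{vmatrix} -7 & 2 \\ -4 & 2 \end{vmatrix} = 6 \qquad C_{3,3} = \begin{vmatrix} -7 & 3 \\ -4 & 1 \end{vmatrix} = 5$$

so

$$C^{-1} = \frac{1}{1} \begin{bmatrix} 1 & -3 & 4 \\ 2 & -5 & 6 \\ 1 & -3 & 5 \end{bmatrix} = \begin{bmatrix} 1 & -3 & 4 \\ 2 & -5 & 6 \\ 1 & -3 & 5 \end{bmatrix}.$$


Decoding is therefore done by multiplying


$$\begin{bmatrix} 1 & -3 & 4 \\ 2 & -5 & 6 \\ 1 & -3 & 5 \end{bmatrix} \begin{bmatrix} -199 & -273 & -572 & -245 & -150 & -389 & 412 & -142 & -231 \\ -78 & -145 & -294 & -127 & -84 & -174 & 272 & -59 & -132 \\ 14 & -13 & -49 & -7 & -1 & -10 & 103 & 16 & -33 \end{bmatrix} = \begin{bmatrix} 91 & 110 & 114 & 108 & 98 & 93 & 8 & 99 & 33 \\ 76 & 101 & 32 & 103 & 114 & 32 & 82 & 107 & 0 \\ 105 & 97 & 65 & 101 & 97 & 83 & 111 & 115 & 0 \end{bmatrix}$$


and the numeric message is 91 76 105 110 101 97 114 32 65 108 103 101 98 114 97 93 32
83 8 82 111 99 107 115 33 0 0. The last step is to look these numbers up in the ASCII table.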
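
The decoding multiplication and the table lookup can both be delegated to a few lines of Python. This is a sketch under the assumption that the decoding matrix is the inverse of the coding matrix found above; the function name `decode` is illustrative, and `chr` is used in place of looking each number up in the ASCII table.

```python
def decode(coded, decoding_matrix):
    """Multiply each group of n coded numbers (one column) by the decoding matrix."""
    n = len(decoding_matrix)
    numbers = []
    for start in range(0, len(coded), n):
        column = coded[start:start + n]
        for row in decoding_matrix:
            numbers.append(sum(r * x for r, x in zip(row, column)))
    return numbers


C_inv = [[1, -3, 4],
         [2, -5, 6],
         [1, -3, 5]]

coded = [-199, -78, 14, -273, -145, -13, -572, -294, -49,
         -245, -127, -7, -150, -84, -1, -389, -174, -10,
         412, 272, 103, -142, -59, 16, -231, -132, -33]

numbers = decode(coded, C_inv)
print(numbers)
print(repr("".join(chr(k) for k in numbers)))  # repr makes control characters visible
```

Printing with `repr` shows the control characters (codes 8 and 0) as escape sequences instead of letting the terminal act on them.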
1.7 Eigenpairs


—2 


1 oi and compute each product before reading on. 


Let A -| 


> 
> — 
— 
| 
— ee & 
— 
——— 
> 
—_— > 
ee 
| 
NINH Wily | 
Wwloo 
ea 
—) 


= _8 
You should find that | a |-| a fal 3 |-| i || . Put another 


way, 


Put yet another way, 
2 2 
Av; = -3v1 a(-2u] =-3 (-2"] 


where vj = and v2 = , making the relationship between the matrix A and the vectors more apparent. The 


4 1 
—1 1 | 
product of A with each of these vectors gives a scalar multiple of the vector. That’s unusual, and a quick experiment 
will illustrate. Try this: 


1. Write down a 2 x 2 matrix M with four random nonzero entries (from —10 to 10, say). 
2. Write down a $2 \times 1$ vector u with two random nonzero entries.
3. Compute Mu. 
You will almost certainly find that Mu is not a multiple of u. Using the example matrix A from above as a random 


: -1 
matrix M and the vector | as a random vector u, 


aa tele allele 


There is no number 2 such that A =A ;. demonstrating that Mu is no scalar multiple of u. Multiplying a 
vector by A does not always produce a multiple of the vector. 
rete 2 Slee AM el So eae! eal |e lue of 2. S 
etting B=}, _, | Bi =|] _) _» > _3 | but) _, 3 | for any value of 2. So vi 


is not some magical vector such that multiplying it by any matrix gives a multiple. That kind of vector does not exist. 

From the evidence $A$ is not special on its own, nor are $v_1$ or $v_2$ special on their own. $A$ and $v_1$ are only special
together just as $A$ and $v_2$ are only special together. To indicate the special relationship between $A$ and $v_1$ ($Av_1 = \lambda v_1$
for some $\lambda$), we call $v_1$ an eigenvector of $A$. Similarly, $v_2$ is an eigenvector of $A$. But that doesn’t tell the whole
story. $Av_1 = \lambda v_1$ and $Av_2 = \lambda v_2$ for different values of $\lambda$. The value $-3$ is associated with the eigenvector $v_1$ and
the value 2 is associated with the eigenvector $v_2$. To mark this relationship, we call $-3$ an eigenvalue of $A$ associated


with eigenvector and we call 2 an eigenvalue of A associated with the eigenvector . Any eigenvalue 


2 1 
together with an associated eigenvector is called an eigenpair. For each eigenvector there is an eigenvalue, and for 
each eigenvalue there is an eigenvector (or is there?). 
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It is true, for any matrix A and any scalar A, that AO = 20 where 0 is the proper size vector with a zero for each 
entry, a so-called zero vector. Given the truth of this statement for all matrices, it does not tell us anything useful 
about a matrix. Moreover, if we allow 0 to be an eigenvector, the eigenvalue associated with 0 would be ill-defined. 
It could be any number! Therefore we disallow 0 from the definition of eigenvector. With this restriction, every 
eigenvector has a unique associated eigenvalue. 

Given an eigenvector v of a matrix M, it is easy to calculate the associated eigenvalue. Given an eigenvalue of a 
matrix, finding an associated eigenvector takes some work. Suppose 2 is an eigenvalue of the matrix M. By definition, 
the associated eigenvector v satisfies Mv = Av. Equivalently, 


(M — ADv = 0. (1.7.1) 


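We will see why these equations are equivalent later on. The equivalent form (1.7.1) gives us a way to find eigenvectors. Each side of the equation expresses a vector, and for two vectors to be equal, their corresponding entries must be equal. For a vector $v$ with $n$ entries, this observation yields $n$ linear equations in the $n$ unknown entries of $v$. Recalling how to solve linear systems of equations gives a solution for $v$. To illustrate, we find an eigenvector of

$$\begin{bmatrix} -31 & 14 & 10 \\ -76 & 35 & 26 \\ 18 & -9 & -8 \end{bmatrix}$$

associated with the eigenvalue $-2$. Starting with (1.7.1), we have

$$\left( \begin{bmatrix} -31 & 14 & 10 \\ -76 & 35 & 26 \\ 18 & -9 & -8 \end{bmatrix} - \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix} \right) v = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

or

$$\begin{bmatrix} -29 & 14 & 10 \\ -76 & 37 & 26 \\ 18 & -9 & -6 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$

Multiplying yields

$$\begin{bmatrix} -29v_1 + 14v_2 + 10v_3 \\ -76v_1 + 37v_2 + 26v_3 \\ 18v_1 - 9v_2 - 6v_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

so we must have

$$\begin{aligned} -29v_1 + 14v_2 + 10v_3 &= 0 \\ -76v_1 + 37v_2 + 26v_3 &= 0 \\ 18v_1 - 9v_2 - 6v_3 &= 0 \end{aligned} \tag{1.7.2}$$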


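Can you find a single set of values for the variables $v_1, v_2, v_3$ that solves all three equations? Answer on page 46. One solution is $v_1 = v_2 = 2$ and $v_3 = 3$. We can verify that

$$v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$$

is indeed an eigenvector:

$$\begin{bmatrix} -31 & 14 & 10 \\ -76 & 35 & 26 \\ 18 & -9 & -8 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -4 \\ -4 \\ -6 \end{bmatrix} = -2 \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}.$$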


Note that if $v$ is an eigenvector of $A$ associated with the eigenvalue $\lambda$, so is $cv$ for any nonzero scalar $c$. Therefore, solving for an eigenvector will always yield an infinite number of solutions. There is no unique eigenvector associated with a given eigenvalue. We will prove these facts later.
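Going from an eigenvector to its eigenvalue really is the easy direction, and a few lines of plain Python can do it. The sketch below is only an illustration; the helper name eigenvalue_from_eigenvector is made up for this example. It multiplies, reads the candidate eigenvalue off one nonzero entry, and then checks every entry.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def eigenvalue_from_eigenvector(M, v):
    """Return lam with M v = lam v, or None if v is not an eigenvector of M."""
    if all(x == 0 for x in v):
        return None  # the zero vector is excluded by definition
    w = mat_vec(M, v)
    k = next(i for i, x in enumerate(v) if x != 0)  # any nonzero entry works
    lam = Fraction(w[k], v[k])
    return lam if all(wi == lam * vi for wi, vi in zip(w, v)) else None

M = [[-31, 14, 10], [-76, 35, 26], [18, -9, -8]]
print(eigenvalue_from_eigenvector(M, [2, 2, 3]))  # -2
print(eigenvalue_from_eigenvector(M, [4, 4, 6]))  # -2, a multiple works too
print(eigenvalue_from_eigenvector(M, [1, 0, 0]))  # None
```

The second call illustrates the note above: doubling the eigenvector leaves the eigenvalue unchanged.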

Now we can find an eigenvector of a matrix $M$ given an eigenvalue, and we can find an eigenvalue of a matrix $M$ given an eigenvector, but what if we have neither an eigenvalue nor an eigenvector of $M$? Returning to (1.7.1), we know $(M - \lambda I)v = \mathbf{0}$. Certainly if $v = \mathbf{0}$, the equation is true. But we have decided that $\mathbf{0}$ is excluded from being an eigenvector, so we seek solutions where $v \neq \mathbf{0}$. Suppose $\det(M - \lambda I) \neq 0$, meaning $(M - \lambda I)$ is invertible. Then left-multiplying both sides of (1.7.1) by $(M - \lambda I)^{-1}$ yields

$$(M - \lambda I)^{-1}\big((M - \lambda I)v\big) = (M - \lambda I)^{-1}\mathbf{0} = \mathbf{0}.$$

But multiplication by a matrix's inverse undoes multiplication by that matrix, so $(M - \lambda I)^{-1}\big((M - \lambda I)v\big) = v$. Now we have that $(M - \lambda I)^{-1}\big((M - \lambda I)v\big) = \mathbf{0}$ and $(M - \lambda I)^{-1}\big((M - \lambda I)v\big) = v$, so we conclude $v = \mathbf{0}$, which is disallowed as an eigenvector. No solutions come from letting $\det(M - \lambda I) \neq 0$, so it must be that $\det(M - \lambda I) = 0$. Since the determinant of a matrix is a scalar, this equation is a scalar equation (like those you have seen in algebra) in the single variable $\lambda$. Solving that equation for $\lambda$ gives eigenvalues, and knowing eigenvalues gives eigenvectors.
Since

$$\det(M - \lambda I) = 0 \tag{1.7.3}$$

is the linchpin in finding eigenvalues and eigenvectors of a matrix, it has a name: the characteristic equation (of $M$). For an $n \times n$ matrix $M$, the expression $\det(M - \lambda I)$ is an $n$th-degree polynomial in $\lambda$ and is called the characteristic polynomial (of $M$). Perhaps solving equations of the form polynomial = 0 reminds you of factoring. If you know that $(x - 2)(x + 5) = 0$, for example, then you know that either $x - 2 = 0$ or $x + 5 = 0$, giving two solutions, $x = 2$ and $x = -5$. The equation $\det(M - \lambda I) = 0$ takes exactly this form and, when contrived as such, will be factorable.

Exercise 18 of section 3.7 requests an argument that $\det(M - \lambda I) = 0$ if and only if $\lambda$ is an eigenvalue of $M$.
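For a $2 \times 2$ matrix with rows $(a, b)$ and $(c, d)$, expanding the determinant gives $\det(M - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)$, so the characteristic equation is an ordinary quadratic. The sketch below, in plain Python, solves it with the quadratic formula. The matrix is a hypothetical example chosen so the polynomial factors nicely, and the function names are made up for this illustration.

```python
import cmath

def char_poly_2x2(M):
    """Coefficients (1, p, q) of det(M - lam*I) = lam^2 + p*lam + q."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

def eigenvalues_2x2(M):
    """Roots of the characteristic polynomial; they may be complex."""
    _, p, q = char_poly_2x2(M)
    root = cmath.sqrt(p * p - 4 * q)
    return ((-p + root) / 2, (-p - root) / 2)

M = [[4, 1], [2, 3]]       # hypothetical example matrix
print(char_poly_2x2(M))    # (1, -7, 10), that is, (lam - 2)(lam - 5)
print(eigenvalues_2x2(M))  # the roots 5 and 2, reported as complex numbers
```

Using cmath rather than math keeps the sketch working when the discriminant is negative and the eigenvalues are complex.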


Key Concepts 

characteristic equation $\det(M - \lambda I) = 0$ for any matrix $M$.

characteristic polynomial $\det(M - \lambda I)$ for any matrix $M$.

eigenpair An eigenvalue together with an associated eigenvector. $(\lambda, v)$ is an eigenpair for matrix $M$ if $v$ is nonzero and $Mv = \lambda v$.
eigenvalue A value $\lambda$ is an eigenvalue of the matrix $M$ if there is a nonzero vector $v$ such that $Mv = \lambda v$.
eigenvector A nonzero vector $v$ is an eigenvector of a matrix $M$ if there is a value $\lambda$ such that $Mv = \lambda v$.


zero vector Any vector with a zero for each entry.


SageMath 


If M is a matrix in SageMath, then M.eigenvectors_right() lists its eigenvalues and eigenvectors, M.eigenvalues() lists only its eigenvalues, and M.charpoly() gives its characteristic polynomial. The following code computes the eigenvalues, eigenvectors, and characteristic polynomial of

$$A = \begin{bmatrix} -112 & -21 & -15 & -372 \\ -84 & -13 & -19 & -292 \\ -36 & -13 & -3 & -116 \\ 36 & 7 & 5 & 120 \end{bmatrix}$$

    M = matrix(4,4, [-112,-21,-15,-372,-84,-13,-19,-292,
                     -36,-13,-3,-116,36,7,5,120])
    print(M); print()
    print(M.eigenvalues()); print()
    print(M.eigenvectors_right()); print()
    print(M.charpoly())


[SageMath Cell 37] The output of the code is

    [-112  -21  -15 -372]
    [ -84  -13  -19 -292]
    [ -36  -13   -3 -116]
    [  36    7    5  120]

    [-4, -12, 4, 4]

    [(-4, [
    (1, -3, -3, 0)
    ], 1), (-12, [
    (1, 2/3, 2/3, -1/3)
    ], 1), (4, [
    (1, -1/3, 1, -1/3)
    ], 2)]

    x^4 + 8*x^3 - 64*x^2 - 128*x + 768


Since the characteristic polynomial of an $n \times n$ matrix has degree $n$, the matrix has $n$ eigenvalues, counting multiplicities and complex eigenvalues. The output of the eigenvalues() method above shows that 4 has multiplicity 2. It is listed twice in the list of eigenvalues ([-4, -12, 4, 4]). The eigenvectors_right() method gives the same information and more. It prints out each eigenvalue, all associated eigenvectors, and finally the multiplicity of the eigenvalue. In the output above, the eigenvectors_right() method shows the multiplicity of the eigenvalue 4 by listing the eigenvalue 4 followed by its associated eigenvector $(1, -1/3, 1, -1/3)$ and finally its multiplicity 2 (the last 3 lines of output before the characteristic polynomial):

    ], 1), (4, [
    (1, -1/3, 1, -1/3)
    ], 2)]
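It is worth checking output like this by hand at least once. The sketch below does the check in plain Python (not SageMath, so it runs anywhere), using exact fractions. It confirms $Mv = \lambda v$ for each eigenvalue and eigenvector pair reported above.

```python
from fractions import Fraction as F

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

M = [[-112, -21, -15, -372],
     [ -84, -13, -19, -292],
     [ -36, -13,  -3, -116],
     [  36,   7,   5,  120]]

# (eigenvalue, eigenvector) pairs copied from the SageMath output
pairs = [
    (-4,  [1, -3, -3, 0]),
    (-12, [1, F(2, 3), F(2, 3), F(-1, 3)]),
    (4,   [1, F(-1, 3), 1, F(-1, 3)]),
]

for lam, v in pairs:
    # each comparison is exact because Fraction arithmetic has no round-off
    print(lam, mat_vec(M, v) == [lam * x for x in v])
```

Each line of output should end in True.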


Exercises 


1. Use the fact that v is an eigenvector of A to find an eigen- 
value of A. 


8 6]. [ -2 
om[h Sh-[ 3 
5 


(b) A= 


eee 
NO 


7 jv=| 2 | [A]-348 


-4 1 1 0 
(c) A=] 2 0 -2 |rv=] 1 [S]-287 
ae a | =f 
24 -8 10 2 
(d) A= 0 6 0 I;v=] 0 
-45 18 -21 3 
-28 O 48 —48 
80 6 -186 182 
aa et ce ee Oe 
4 -1 -15 21 
2 
oe 3 
“| 2 
1 
-24 -8 11 27 
216 52 -130 -258 
a= 52 -8 29 57 : 
52 8 —33 -61 
-l1 
v= 4 [A]-348 


2. Find the characteristic polynomial. 


(a) | =i 2 | 


—2 -4 
} 7 12 | 
-12 12 
(c) 9 -9 | [A]-348 
-8 -3 
(d) 3 > | [S]-287 
1 0 2 
(e) | 4 5 1 [A]-348 
-6 -6 -4 
-12 9 6 
(f) | -15 12 6 | 
= 3 9 
9 -15 4 
(g) | 39-33 28 | 
33-15 28 
3. Find the eigenvalues. They may be complex. 
(a) E ee | 
(b) ae ae | [A]-348 
-9 4 
(c) -36 15 
-7 25 
(@) one | [S]-288 
Of a 
(f) | 1 15 7 [A]-348 
(g) eae | 
(h) ee 2 [A]-348
-1 10 6
(i) 2 3 2. 
—2 -2 -1 
2 3 «-2 
(j) 12 -17 —-12 | [A]-348 
-15 21 15 
3 0 0 
(k) 4 13 -12 | [S]-288 
4 16 -15 
-4 3 3 0 
-16 1 -13 -24 
) -4 2 4 0 
6 -3 -3 2 
4. Find an eigenvector associated with the given eigenvalue. 
3 -10 
@ A=| 5 15 jaq-s 
(b) A -| a . fia = 2 [A]-348 
-4 2 
() A-| aes azo [S}-288 
1 6 
@ A=| 3 - ja=-24av9 
(e) A-| Bs as-is [A]-348 
-5 6 -12 
(ff) A=| 7 -8 16 |;A=-2 
5 -6 12 
9 1 -5 
(g) A=| 33 17) -25 |;A=6 
36 12 -24 
-18 6 6 
(h) A=} -19 5 9 J;A=0 [A]-348 
-11 5 1 


5. Describe a connection between eigenvalues and determi- 


nants. 


6. True or false? [A]-348

(a) The only eigenvector corresponding to a zero eigenvalue is the zero vector.

(b) An eigenvalue may be any complex number except zero.

(c) An eigenvector cannot be the zero vector.

(d) All 2 x 2 matrices have two different eigenvalues.

(e) Each eigenvalue has exactly one corresponding eigenvector.

(f) An eigenvalue may be any number, including zero and complex numbers.

(g) An $n \times n$ matrix can have $n + 1$ eigenvalues.

7. True or false? If all the entries in a square matrix $M$ are integers but one of its eigenvalues is $\sqrt{2}$, then the entries of the corresponding eigenvector cannot all be integers.

8. Suppose matrix $A$ is a $3 \times 3$ matrix such that $A \begin{bmatrix} 20 \\ -16 \\ 8 \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \\ 2 \end{bmatrix}$. Find an eigenvalue of $A$.

9. [SageMath Cell 38] Let

$$M = \begin{bmatrix} 99 & -135 & -199 & 417 \\ 30 & -36 & -61 & 123 \\ 90 & -135 & -174 & 369 \\ 30 & -45 & -57 & 120 \end{bmatrix}$$

and use SageMath to determine which of the following vectors is an eigenvector of $M$. Also determine its associated eigenvalue. [A]-348

$$\begin{bmatrix} 4 \\ 3 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ -6 \\ -3 \end{bmatrix},\ \begin{bmatrix} -1 \\ -1 \\ -6 \\ 5 \end{bmatrix},\ \begin{bmatrix} -1 \\ 2 \\ -6 \\ -2 \end{bmatrix},\ \begin{bmatrix} 4 \\ 2 \\ 1 \\ -1 \end{bmatrix}$$

10. [SageMath Cell 39] Use SageMath's M.eigenvectors_right() method to compute the eigenvectors of $M$ from question 9. Notice that none of the vectors from that question is listed as an eigenvector by SageMath, yet one of them is. Can you resolve this conundrum? Is it possible that linear combinations of eigenvectors are also eigenvectors?

11. [SageMath Cell] Check your work on question 2 using SageMath's charpoly() method. [S]-289


Answers

system solution: One possible solution of equations (1.7.2) follows. Start by dividing the third equation by 3, which yields

$$\begin{aligned} -29v_1 + 14v_2 + 10v_3 &= 0 \\ -76v_1 + 37v_2 + 26v_3 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned}$$

The $v_3$ variable can be eliminated from the first two equations by adding 5 times the third equation to the first and 13 times the third equation to the second:

$$\begin{aligned} v_1 - v_2 &= 0 \\ 2v_1 - 2v_2 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned}$$

Dividing the second equation by 2, we see it is just a repeat of the first equation, $v_1 - v_2 = 0$, which implies that $v_1 = v_2$. Substituting into the third equation, we find

$$6v_2 - 3v_2 - 2v_3 = 0$$

or $3v_2 - 2v_3 = 0$, which means $v_3 = \tfrac{3}{2}v_2$. This set of equations has infinitely many solutions! They take the form $v_1 = v_2$, $v_3 = \tfrac{3}{2}v_2$, and $v_2$ is any number. In terms of the eigenvector, this observation means

$$v = \begin{bmatrix} v_2 \\ v_2 \\ \tfrac{3}{2}v_2 \end{bmatrix} = v_2 \begin{bmatrix} 1 \\ 1 \\ \tfrac{3}{2} \end{bmatrix}$$

for some value $v_2$. If $v_2 = 0$, however, then $v = \mathbf{0}$, which is disallowed as an eigenvector. Therefore every nonzero scalar multiple of $\begin{bmatrix} 1 \\ 1 \\ \tfrac{3}{2} \end{bmatrix}$ is an eigenvector of the matrix associated with the eigenvalue $-2$.


Row Operations


2.1 Systems of Linear Equations 


People have been solving systems of linear equations for millennia, since long before the advent of what we know
today as algebra. Recorded history of linear systems dates back to about 150 BCE in China! Even the modern 
process of elimination, often learned in high school algebra and precalculus classes, and demonstrated on page 46, 
dates back in geometric form to at least the second century CE (circa 100), appearing as a narrative in the Chinese 
treatise Nine Chapters on the Mathematical Arts. Roger Hart proposes that a geometric version of the modern day 
procedure of using determinants to solve systems is also evident in Nine Chapters [11]. 


Crumpet 12: Yanghui Triangle 


In China, the triangular arrangement of binomial coefficients, the first five rows of which are

        1
       1 1
      1 2 1
     1 3 3 1
    1 4 6 4 1

is often called the Yanghui triangle. It was devised in the 11th century CE by Jia Xian and popularized in the 13th century by Yang Hui. Omar Khayyam, an 11th century Persian figure, also studied the triangle. [13]


In our modern treatment of systems of linear equations, a linear equation is an equation that takes the form

$$c_1v_1 + c_2v_2 + \cdots + c_nv_n = b \tag{2.1.1}$$

for coefficients $c_1, c_2, \ldots, c_n$, variables $v_1, v_2, \ldots, v_n$, and constant $b$. The (ordered) list $s_1, s_2, \ldots, s_n$ is a solution of the equation if and only if substitution of the values $s_1, s_2, \ldots, s_n$ for the variables $v_1, v_2, \ldots, v_n$, respectively, makes the equation true. A set of linear equations is called a linear system or system of linear equations. The (ordered) list $s_1, s_2, \ldots, s_n$ is a solution of a linear system if and only if it is a solution of every equation in the system. The set of equations (1.7.2) on page 43 is an example of a system of linear equations, and the lists $1, 1, \tfrac{3}{2}$ and $2, 2, 3$ are solutions of the system (given that these lists specify values of the variables $v_1, v_2, v_3$ in that order).
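The definition of a solution translates directly into a check: substitute, then compare each lefthand side with its constant. Here is a minimal sketch in plain Python applied to the equations (1.7.2). Representing each equation as a pair (coefficients, constant) is just a convenience chosen for this illustration.

```python
from fractions import Fraction

def is_solution(system, values):
    """True when the ordered list of values satisfies every equation.

    Each equation is a pair (coefficients, constant) standing for
    c1*v1 + c2*v2 + ... + cn*vn = b.
    """
    return all(
        sum(c * s for c, s in zip(coeffs, values)) == b
        for coeffs, b in system
    )

# The system (1.7.2)
system = [
    ([-29, 14, 10], 0),
    ([-76, 37, 26], 0),
    ([18, -9, -6], 0),
]

print(is_solution(system, [2, 2, 3]))                # True
print(is_solution(system, [1, 1, Fraction(3, 2)]))   # True
print(is_solution(system, [1, 1, 1]))                # False
```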

Calculations like the one on page 46 easily submit to the conciseness of matrices. If you are familiar with synthetic division, you are already familiar with this idea. Only coefficients and constants are retained. All variables and other “extraneous” symbols are unused. Each step below shows the original calculation from page 46 first, followed by an accounting of the coefficients and constants of each equation at that step, maintained in a $3 \times 4$ matrix. The first column holds the $v_1$ coefficients, the second column holds the $v_2$ coefficients, the third column holds the $v_3$ coefficients, and the fourth column holds the constants from the righthand sides of the equations.

The given system:

$$\begin{aligned} -29v_1 + 14v_2 + 10v_3 &= 0 \\ -76v_1 + 37v_2 + 26v_3 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned} \tag{2.1.2}$$

As a matrix:

$$\begin{bmatrix} -29 & 14 & 10 & 0 \\ -76 & 37 & 26 & 0 \\ 6 & -3 & -2 & 0 \end{bmatrix}$$

Adding 5 times the third equation to the first and 13 times the third equation to the second:

$$\begin{aligned} v_1 - v_2 &= 0 \\ 2v_1 - 2v_2 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned}$$

Adding 5 times the third row to the first and 13 times the third row to the second:

$$\begin{bmatrix} 1 & -1 & 0 & 0 \\ 2 & -2 & 0 & 0 \\ 6 & -3 & -2 & 0 \end{bmatrix}$$

Dividing the second equation by 2, we see it is just a repeat of the first equation, $v_1 - v_2 = 0$, so we can scrap it:

$$\begin{aligned} v_1 - v_2 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned}$$

Dividing the second row by 2, we see it is just a repeat of the first row, $1 \ {-1} \ 0 \ 0$, so we can zero it out and swap it with the third row:

$$\begin{bmatrix} 1 & -1 & 0 & 0 \\ 6 & -3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Substituting $v_1 = v_2$ into the equation $6v_1 - 3v_2 - 2v_3 = 0$:

$$3v_2 - 2v_3 = 0$$

Subtracting 6 times the first row from the second:

$$\begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

In either case, we have reduced the system to the two equations $v_1 - v_2 = 0$ and $3v_2 - 2v_3 = 0$, from which the solution follows.

Much like successful synthetic division is dependent on strict ordering of the coefficients of the polynomial, it 
should be noted that the success of the matrix process is dependent on strict ordering of the entries of the matrix. 
Each row of the matrix represents one equation. The rightmost column of the matrix represents the constants from 
the righthand sides of the equations. Each of the remaining columns represents the coefficients of a single variable 
from the lefthand sides of the equations. It can easily be verified that the systems

$$\begin{aligned} -29v_1 + 14v_2 + 10v_3 &= 0 \\ -76v_1 + 37v_2 + 26v_3 &= 0 \\ 6v_1 - 3v_2 - 2v_3 &= 0 \end{aligned} \qquad\text{and}\qquad \begin{aligned} 14v_2 + 10v_3 &= 29v_1 \\ 26v_3 + 37v_2 &= 76v_1 \\ 6v_1 &= 2v_3 + 3v_2 \end{aligned}$$

are equivalent. Each equation of the system on the left has simply been rewritten using positive coefficients in the system on the right. However, the matrix representation of the system on the right, which might be written as

$$\begin{bmatrix} 14 & 10 & 29 & 0 \\ 26 & 37 & 76 & 0 \\ 0 & 6 & 2 & 3 \end{bmatrix}$$

is not helpful in solving the system using row operations. Certainly we can subtract 38 times the third row from the second row:

$$\begin{bmatrix} 26 & 37 & 76 & 0 \end{bmatrix} - 38 \begin{bmatrix} 0 & 6 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 26 & -191 & 0 & -114 \end{bmatrix}$$

to obtain the matrix

$$\begin{bmatrix} 14 & 10 & 29 & 0 \\ 26 & -191 & 0 & -114 \\ 0 & 6 & 2 & 3 \end{bmatrix}$$

creating a zero in the 2,3-entry just as before. The problem is the $-191$ and $0$ of the resultant matrix have no meaning in terms of the original system. Those parts of the calculation represent $37v_2 - 38(6v_1)$, which simplifies to $37v_2 - 228v_1$ and not necessarily $-191$ times anything; and $76v_1 - 38(2v_3)$, which simplifies to $76v_1 - 76v_3$ or $76(v_1 - v_3)$ and not necessarily $0$ times anything. For row operations to make sense in the context of solving systems of equations, the entries in a single column must all be coefficients of the same variable or constants from the equations of the system. By convention, the rightmost column always holds the constants and the remainder of the columns represent the coefficients of one variable each. The order of the columns of coefficients is flexible as long as the order is known.

Besides noting that the order of the entries in the matrix is critical, what should be taken from the matrix solution 
of (2.1.2) is the fact that three row operations are enough to mirror the process of elimination using matrices. On the 
matrix side, we swapped rows, multiplied rows by a nonzero constant, and added multiples of one row to another. 
Everything done in solving a linear system can be modeled by one of these operations on the corresponding matrix. 
As such, these three operations are called elementary row operations. To summarize, name them, and provide 
shorthand notation, the elementary row operations are 


Swap: swap two rows.

Shorthand for swapping rows $j$ and $k$ of matrix $M$: $M_{j,:} \leftrightarrow M_{k,:}$

Scale: multiply each entry in a row by a nonzero scalar.

Shorthand for scaling row $j$ of matrix $M$ by $s$: $sM_{j,:} \to M_{j,:}$

Replace: add a multiple of one row to another.

Shorthand for adding $s$ times row $j$ to row $k$ of matrix $M$: $sM_{j,:} + M_{k,:} \to M_{k,:}$

Any system of linear equations can be translated into a matrix and solved using these three operations. Even systems with no solution reveal themselves as unsolvable under the direction of these three operations. If that were the only purpose of row operations, they would be useful, but as the concepts of linear algebra unfold, the ideas laid out here will have many further reaching consequences.
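The three operations are simple enough to write as three short functions. The sketch below is one possible rendering in plain Python, with a matrix stored as a list of rows and rows numbered from 0 (an assumption of this illustration; the text numbers rows from 1). The function names swap, scale, and replace are chosen to match the names above. The example replays the matrix solution of (2.1.2).

```python
from fractions import Fraction

def swap(M, j, k):
    """Swap rows j and k of M."""
    M[j], M[k] = M[k], M[j]

def scale(M, j, s):
    """Multiply each entry of row j by the nonzero scalar s."""
    if s == 0:
        raise ValueError("scaling by zero is not an elementary row operation")
    M[j] = [s * x for x in M[j]]

def replace(M, s, j, k):
    """Add s times row j to row k."""
    M[k] = [s * a + b for a, b in zip(M[j], M[k])]

# Matrix of the system (2.1.2); rows are indexed 0, 1, 2 here.
M = [[-29, 14, 10, 0], [-76, 37, 26, 0], [6, -3, -2, 0]]

replace(M, 5, 2, 0)          # add 5 times the third row to the first
replace(M, 13, 2, 1)         # add 13 times the third row to the second
scale(M, 1, Fraction(1, 2))  # the second row is now a repeat of the first
replace(M, -1, 0, 1)         # zero out the repeated row
swap(M, 1, 2)                # move the zero row to the bottom
replace(M, -6, 0, 1)         # subtract 6 times the first row from the second

for row in M:
    print([str(x) for x in row])
```

The rows printed at the end correspond to $v_1 - v_2 = 0$ and $3v_2 - 2v_3 = 0$, followed by a row of zeros.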


Elementary Matrices 


Any matrix resulting from performing an elementary row operation on an identity matrix is called an elementary matrix. For example

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is an elementary matrix since it is just $I_{4 \times 4}$ with the first two rows swapped. The feature of interest is that left-multiplying any other matrix by this elementary matrix has the effect of performing the corresponding row operation (swapping the first two rows) on that arbitrary matrix so long as the product is defined. For example, if we let
0 -12 4 -10 10 
6 2 -8 -2 5 
A=] _ 5 4 6 9 7} then 
1 -l 8 -9 3 


010 017 0 -12 4 -10 10 6 & <a <e -3 
1000/6 2 -8— 2 5 0 -12 4 -10 10 

Fee) GQ Ol <s <A a6: FPS Sa ce. oO OF eh) 
oooill1 -1 8 -9 3 1 -1 8 -9 3 


Take a short break to verify at least one of the entries in each row. This exercise will help you see why the product 
works out as shown and will help illuminate the following computations. 

The $1,j$-entry of $EA$ is a linear combination of the entries in the $j$th column of $A$. To be precise, $(EA)_{1,j} = 0A_{1,j} + 1A_{2,j} + 0A_{3,j} + 0A_{4,j}$. The entries from the first row of $E$ are used as the coefficients of the linear combination needed to compute each entry of the first row of $EA$. It follows that the first row of $EA$ is a linear combination of the rows of $A$ using these same coefficients. Symbolically, $(EA)_{1,:} = 0A_{1,:} + 1A_{2,:} + 0A_{3,:} + 0A_{4,:}$. In summary, each row of $EA$ is the linear combination of the rows of $A$ with coefficients from the corresponding row of $E$. To compute the third row of $EA$, for example, we use the entries from the third row of $E$ as coefficients of a linear combination of the rows of $A$:

$$(EA)_{3,:} = 0A_{1,:} + 0A_{2,:} + 1A_{3,:} + 0A_{4,:} = A_{3,:}.$$

All the rows of $EA$ can be computed quickly from this perspective. The first row of $E$, $\begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}$, suggests that the first row of $EA$ is 0 times row 1 of $A$ plus 1 times row 2 of $A$ plus 0 times row 3 of $A$ plus 0 times row 4 of $A$. Looking at it this way makes it clear that the first row of $EA$ is simply the second row of $A$. Similarly, the second row of $EA$ is just the first row of $A$, the third row of $EA$ is the third row of $A$, and the fourth row of $EA$ is the fourth row of $A$. In other words, multiplying $A$ by $E$ has the effect of swapping the first two rows of $A$ as claimed. The effect of multiplying by other elementary matrices can be verified similarly.
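The claim can also be tested by brute force. In the plain Python sketch below, each elementary matrix is built by performing one row operation on an identity matrix and then multiplied on the left of a matrix $A$. The matrix $A$ here is a hypothetical example with easy-to-read entries, not the one from the text, and rows are indexed from 0.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    """The matrix product XY."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Hypothetical 4 x 3 matrix used only for illustration.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

E = identity(4)
E[0], E[1] = E[1], E[0]   # swap: the first two rows of the identity
S = identity(4)
S[2][2] = 3               # scale: the third row of the identity times 3
R = identity(4)
R[2][0] = -5              # replace: add -5 times row one to row three

print(mat_mul(E, A))  # rows one and two of A are swapped
print(mat_mul(S, A))  # row three of A is tripled
print(mat_mul(R, A))  # row three of A becomes -5*(row one) + (row three)
```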

Key Concepts 

elementary matrix The matrix resulting from performing a single row operation on an identity matrix. 


elementary row operation One of swap, scale, or replace. 


linear equation An equation of the form $c_1v_1 + c_2v_2 + \cdots + c_nv_n = b$ where $c_1, c_2, \ldots, c_n$ are coefficients, $v_1, v_2, \ldots, v_n$ are variables, and $b$ is a constant.


linear system Another name for a system of linear equations.

row swap Swapping two rows of a matrix.

row scale Multiplying each entry of a single row of a matrix by a nonzero scalar.

row replace Replacing a row of a matrix with the sum of it plus some multiple of another row.

system of linear equations A set of linear equations.

solution of a linear system An (ordered) list of values $s_1, s_2, \ldots, s_n$ that is a solution of every equation in the system.


solution of a linear equation An (ordered) list of values $s_1, s_2, \ldots, s_n$ such that substitution for the variables $v_1, v_2, \ldots, v_n$, respectively, in the linear equation $c_1v_1 + c_2v_2 + \cdots + c_nv_n = b$ makes the equation true.


Exercises 3. + 2m + 7 = 3y 
(d) 8 + vw = Sy + 23 [A]- 
1. Represent the linear system as a matrix. vyo+ wh h6UthlURlCU= lO 
348 
-x - 4y + 8 = O 
(a) 6x + Ty - 8 = 6 2. Write the linear system associated with the matrix. As- 
-5x - Ty + 92 = 9 sume the variables of the system are v1, V2,..., Vn, their 
coefficients appear in that order in the matrix, and the 
3x + 2 - 8 = 9 rightmost column holds the constants. 
(b) - 3y + 2z = 10 [A]-348 
-Ix + = -ll 5 -7 8 
: @) | 8 6 2 | 
2x, - 8 + x%4 = O 


-13 2 -1 13 
15 -9 -6 12 


i 3 3 we SS ae eo eae: 
(b) [A]-348
6 0 -ll 13° AG he et 12 -13 16
(yl 4 2 «1 @) Bb) 15 -8 2 -7 |=} -60 32 <8 
—§ 15 1 =§. «<f <3 oi 5 4 3 


10 -9 -1 3 15 6 


| 12 13 5 -4 6, | ws 


(b) E- 


3. The matrix for a linear system is given. Find one solu- 


tion of the associated system. Assume the variables of 12 18 -2 12 18 -2 
the system are vj, V2,...v, and their coefficients appear (c) E-| -10 3 6 |=} —-10 3 6 
in that order in the matrix. I 3 3 -30 -10 -6 
100 6 4 -5 4 —5 
(a)|0 3 0 8 (d) E-} 2 9 |=] -18 -16 
0 0 1 -2 4 5 4 5 
11 00 0 9 -4 -14 13 14 
0 5 0 =O =7 | -19 7 _| —19 vi 
©] 9 9 1 o ~13 | S)289 () EF) 59 17 |=] -20 17 
0 0 0 -2 6 13 14 -4 -14 
9 -9 -2 8 9 -7 23 36 23 
(c) 03 4 (f) E- 5 9 10}; _} 5 9 10 
9 3 -41° ] 9 3 -4 
1 0 9 12 A. <j 4 =<4 <1 4 
(d) | O -8 -4 -1 | [S]-289 
0 O t 2 8. Perform the row operation on the matrix. 
4. A matrix representing a linear system is given. Explain 3 -8 -9 
why the system has no solution. (a) A= rd . o patAL, =e Ae. 
[. 2. 3 | -l 1 -4 
0 0 2 
(b) B= a ae ; 5Bo,, > Bo, [A]-348 
: : ; Poe : 1 -2 6 -4 
5. A matrix representing a linear system is given. Find one 
solution of the system, and explain why the system has -6 2 
infinitely many more solutions. [A]-348 ‘c= : : nits ea 
1 02 -5 1 -9 
0 13 4 
000 0 a ee 
(4d) D=| 5 6 -—2 |; D,, @ D3, [A]-348 
6. What elementary row operation will transform the first 9 9 -4 
matrix into the second? -8 5 
1 —4 4 ~9 1 -4 4 ~9 (e) E= -7 -9 |; 3E>. + E3. => E3. 
| i 4 oe fi tf 6 -5 1 | oe 
—2 #7 1 1 0 19 -9 3 -5 1 9 -6 
=3 3 3 a] 223 3 3 7 (f) Fo= 3 -5 1 -1 A -5F\, Sd F >. ~~ 
()| 4 -3 3 6 |} -9 -6 9 -1 ] [AF “8 3 2 2 
9 6 9 || 4 3 3 6 Foe aes 
348 9. The matrix T is given. What elementary row operation 


does left-multiplication by T perform? 


22 9. 0 
ar 10 | 
0 01 


=) 3 2). 77 9 -9 7 -9 
(c)| 2 -6 3 -5 ],} 2 -6 3. -5 


-9 -9 7 -9 =) 3 =2. 7 


5 4.8 2 5 -4 8 2 
(d)|9 3 -2 -2]|,)0 48 79 16 


[A]- 


1 0 0 0 
i =. tea SD, 1 -5 -9 -2 es 010 0 
0 0 0 1 
2 6 -1 -8 2 6 -1 -8 0010 
(e) | -l 1 -6 4 j,J -2 2 -12 8 
2 -5 -6 6 2 -5 -6 6 oe 
(c) | O 1 O | [S]-289 
7. What elementary row matrix E makes the equation true? 04 1
0 0 1

(d) | 0 1 O | [A]-348 
1 0 0 
1 0 0 0 
0 1 0 0 

(e) 003 0 [A]-348 
0 0 0 1 


1 0 O 
(f)} O 1 0 
-5 0 1 


10. Carry out a single elementary row operation to secure a 
1 in row 1, column 1. 


10 1° 7 
« | 3-3 | 
5 6 -4 
(b) oe | [A}-349 


=I) 2:3 
©) -2 10 -3 | 
11. Execute a single elementary row operation to secure a 0 
in row 2, column 1. 


ae) 


10 -3 

(b) ae | [A]-349 
1 -2 -3 

© | -2 10 -3 | 


12. Find the first row of the product by computing an ap- 
propriate linear combination of the rows of the righthand 
matrix. 


7 -8 ][ 3 0 
@| 7) ale 4 


=2 4 
{5 3 4 | 


13. If the third row of A contains all zeros, what can you say 
about the third row of AB? Assume the product AB is 
defined. 


14. If the first and second rows of A are multiples of one an- 
other, what can you say about the first and second rows 
of AB? Assume the product AB is defined. [A]-349 




2.2. Row Reduction 


The system associated with a matrix of the form

$$\begin{bmatrix} 1 & 0 & 4 \\ 0 & 1 & 5 \end{bmatrix}$$

has only one solution: 4, 5, and we can see that without doing any computation. The system associated with this matrix, something you can probably visualize mentally, looks like

$$\begin{aligned} x &= 4 \\ y &= 5 \end{aligned}$$

(though you may have imagined different variables). The system is solved! The matrix

$$\begin{bmatrix} 3 & -2 & 2 \\ 5 & -3 & 5 \end{bmatrix}$$

corresponds to a system that is not solved. Can you write down the associated system to verify? Answer on page 61. Row reduction is the process of using elementary row operations to transform a matrix whose associated linear system is not solved into one whose associated linear system is solved, thus solving the system.

The following algorithm describes the process of row reduction for solving a system of n equations in n unknowns, 
starting with a matrix representation of the system. 


Step 1: Select the leftmost column with at least one nonzero entry. This is a pivot column. The topmost position 
(row and column) in this column is a pivot position. If no such column exists or there are no rows below the 
pivot position, continue to step 5. 


Step 2: If the entry in the pivot position is 0, swap rows so the entry in the pivot position is nonzero. This nonzero 
entry is a pivot. 


Step 3: Replace rows until all entries in the pivot column below the pivot are 0. 
Step 4: Take the submatrix consisting of all rows below the pivot and return to step 1. 


Step 5: Select the column of the rightmost pivot position and replace rows until all entries in that column other than 
the pivot are 0. 


Step 6: Scale the row with the pivot so the pivot in that row is 1. 


Step 7: If there are no rows above the pivot, stop. Otherwise, take the submatrix consisting of all rows above the 
pivot and return to step 5. 


Translating the resulting matrix back into a system of equations reveals the solution. Any matrix produced by the completion of steps 1 through 4 is said to be in row echelon form. The system corresponding to a matrix in row echelon form can be solved by back-substitution, so these steps of the algorithm are often sufficient. Any matrix produced by the completion of the entire algorithm is said to be in reduced row echelon form. The system corresponding to a matrix in reduced row echelon form is solved.
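The seven steps can be sketched in plain Python. What follows is one possible reading of the algorithm, not a definitive implementation: it uses exact fractions, always swaps in the first nonzero entry it finds when a pivot is needed, and numbers rows and columns from 0. The function name rref is an assumption of this sketch.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivots = []
    r = 0
    # Steps 1-4: work left to right, clearing entries below each pivot.
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        A[r], A[p] = A[p], A[r]           # swap so the pivot is nonzero
        for i in range(r + 1, m):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]  # replace
        pivots.append((r, c))
        r += 1
    # Steps 5-7: work right to left, clearing above each pivot, then scaling.
    for r, c in reversed(pivots):
        for i in range(r):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]  # replace
        piv = A[r][c]
        A[r] = [a / piv for a in A[r]]                      # scale
    return A

for row in rref([[3, -2, 2], [5, -3, 5]]):
    print([str(x) for x in row])
```

Applied to the unsolved matrix from the start of this section, the sketch returns the solved matrix with rows 1 0 4 and 0 1 5, so the two matrices describe systems with the same solution.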

Even though the algorithm may sound rather rigid, there are choices to be made along the way. All choices will 
lead to the same solution, but different choices may lead to drastically different-looking routes. In the end, all choices 
in row reduction by hand are a matter of preference. 


Crumpet 13: Automated Row Reduction 


A computer programmed to perform row reduction will have to make choices just as a human working by hand would. However, the objectives of a computer are slightly different from the objectives of a human. A human is looking to make the computation as easy as possible while a computer should be programmed to make the computation as accurate as possible. For a computer, working with fractions or decimal values is just as easy as working with integers, and doing a couple “extra” row operations is usually preferable to doing a lengthy analysis of how to avoid them. But computer computations are subject to round-off error, something that should be minimized whenever possible. Swapping rows so the pivot is the entry of greatest magnitude in the column helps reduce round-off error. The following is essentially computer pseudo-code that reduces a matrix to row echelon form.

Step 1: Translate the system into a matrix $A$.

Step 2: Let $i = k = 1$.


Step 3: Swap row $i$ with row $j$ where $j \ge i$ and $|A_{j,k}| \ge |A_{m,k}|$ for all $m \ge i$. If no such row exists with $A_{j,k} \neq 0$, increment $k$ by one and try again as long as $k \le n$. If $k$ reaches $n + 1$, stop.


Step 4: Scale row $i$ by $\frac{1}{A_{i,k}}$.


Step 5: For each $j$ from $i + 1$ through $n$, replace row $j$ by $-A_{j,k}$ times row $i$ plus row $j$.


Step 6: Increment $i$ and $k$ each by one and return to step 3 as long as $k \le n$.


The resulting matrix must be returned to the form of a linear system and solved by back-substitution to complete the solution.
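The pseudo-code can be turned into a short program. The version below is a sketch in plain Python under a few assumptions: it works in floating point (since round-off is the concern), it numbers rows and columns from 0, and it simply runs $k$ across every column of the matrix it is given. The function name row_echelon_partial_pivot is made up for this illustration.

```python
def row_echelon_partial_pivot(rows):
    """Reduce a matrix to row echelon form, choosing each pivot as the
    entry of greatest magnitude in its column (at or below row i)."""
    A = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    i = 0
    for k in range(n_cols):
        if i == n_rows:
            break
        # Step 3: find the row with the largest entry in column k.
        j = max(range(i, n_rows), key=lambda r: abs(A[r][k]))
        if A[j][k] == 0.0:
            continue                       # nothing to pivot on; next column
        A[i], A[j] = A[j], A[i]
        # Step 4: scale row i so the pivot is 1.
        piv = A[i][k]
        A[i] = [x / piv for x in A[i]]
        # Step 5: clear the entries below the pivot.
        for r in range(i + 1, n_rows):
            f = A[r][k]
            A[r] = [x - f * y for x, y in zip(A[r], A[i])]
        i += 1                             # Step 6
    return A

# Hypothetical test system: 3x1 - x2 - 2x3 = 5, 2x1 + 4x2 + 8x3 = -13,
# x1 + 2x2 + 3x3 = -4.
for row in row_echelon_partial_pivot(
        [[3, -1, -2, 5], [2, 4, 8, -13], [1, 2, 3, -4]]):
    print(row)
```

The last row of the result reads off the value of the third variable directly; the other two follow by back-substitution, as the crumpet says.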


To illustrate some of the choices that must be made, consider solving the system

$$\begin{aligned} 3x_1 - x_2 - 2x_3 &= 5 \\ 2x_1 + 4x_2 + 8x_3 &= -13 \\ x_1 + 2x_2 + 3x_3 &= -4 \end{aligned}$$

by row reduction. The following discussion charts the progress of three different approaches: (1) using fractions, (2) avoiding fractions by scaling non-pivot rows, and (3) avoiding fractions by scaling and swapping. All three approaches start from the matrix

$$A = \begin{bmatrix} 3 & -1 & -2 & 5 \\ 2 & 4 & 8 & -13 \\ 1 & 2 & 3 & -4 \end{bmatrix}$$

of the system.
(1) Using fractions (2) Avoiding fractions by scaling @) sohuanen! pacuous/by scale 
and swapping 


Step 1: The first column is a pivot column. The pivot position is the topmost entry of the pivot column. 
There is nothing to do besides note this fact. Step 2: The pivot position must not contain a zero. This is 
already the case in all three approaches, but in approach (3) the first and third rows are swapped to 
secure a | as a pivot. 


A; oOo A3 
1 2 3 -4 
2 4 8 -13 
3-7 -2. 5 


Step 3: Replace rows to secure zeros in all rows below the pivot. Multiples of the row with the pivot 
are added to the rows below. In approach (1) the first row is scaled by i, and in approach (2) the 
second and third rows are scaled by 3 to prepare for replacement. 


5Ai. > Al, 3A. > Ap, and 3A3; > A3; 
2 
bays, | 3 -1 -2 5 
2 4 8 =13 6 12 24 -39 
i 2 = =4 3 6 9 -12 
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—2A\. + Ad. -= Ad, and 


—A|:: + A3. > A3.: 
1 2 5 
ae. Ge 1s 
a a 
o £ Hf _f 
3 3 3 


—2A\. + Ap. => A: and 


—A|: + A3.. > A3.: 
3 -1 —2 5 
0 14 28 -49 
O 7 11 -17 


—2A\. + Ao,: => Ad, and 
—3A\. + A3: => A3.: 


Step 4: Take the submatrix consisting of all rows below the pivot and return to step 1. The first row of 


the matrix is fixed until step 5. 


go 4 2B _49 
a 3 3 
o £ 2 _f 
3 3 3 


0 14 28 
0 7 Ii 


49 
217 


0 0 2  —-5 
0 -7 -ll 17 


Step 1: The second column is a pivot column. The pivot position is the topmost entry of the pivot 
column. There is nothing to do besides note this fact. Step 2: The pivot position must not contain a 
zero. This is already the case in approaches (1) and (2), but in approach (3) the rows must be swapped. 


A; ad A3:: 
0 -7 -11 17 
0 O 2 -5 


Step 3: Replace rows to secure zeros in all rows below the pivot. Multiples of the row with the pivot 
are added to the rows below. In approach (1) the first row is scaled by $3 in approach (2) the second 
row is scaled by 2; and in approach (3) this step is already done. 


5A\.. —> A; 2A2, — Ay 
oa a cB] tee 28 a 
1 
O24 AL se 0 14 22 -34 
—A. + Ad. > Ap: —A|: + Ap. > Ad. 
07 2 -8 0 14 28 -49 
00-1 3 0 0 -6 15 


Step 4: Take the submatrix consisting of all rows below the pivot and return to step 1. The first row of 
the submatrix (second row of the original matrix) is fixed until step 5. 


[o 0 -1 3 | 


[0 0 -6 15 | 


[o 0 2 -5 | 


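Step 1: The third column is a pivot column. The pivot position is the topmost entry of the pivot 
column. There is nothing to do besides note this fact. Since there are no entries below the pivot, we 
move to step 5. At this point, our matrices are as follows. These matrices are all in row echelon form. 

\[ (1)\ \begin{bmatrix} 1 & -1/3 & -2/3 & 5/3 \\ 0 & 7/3 & 14/3 & -49/6 \\ 0 & 0 & -1 & 5/2 \end{bmatrix} \qquad (2)\ \begin{bmatrix} 3 & -1 & -2 & 5 \\ 0 & 14 & 28 & -49 \\ 0 & 0 & -6 & 15 \end{bmatrix} \qquad (3)\ \begin{bmatrix} 1 & 2 & 3 & -4 \\ 0 & -7 & -11 & 17 \\ 0 & 0 & 2 & -5 \end{bmatrix} \]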


Step 5: Select the column of the rightmost pivot position and replace rows until all entries in that 
column other than the pivot are 0. The pivots are $A_{11}$, $A_{22}$, and $A_{33}$, so the pivot positions are in 
columns 1, 2, and 3, and we select the third column. To prepare for the row replacements in approach 
(1) there is nothing to do; in approach (2) the third row is scaled by 1/3; and in approach (3) the first and 
second rows are scaled by 2. 
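Approach (1): replace with $\frac{14}{3}A_{3:} + A_{2:} \to A_{2:}$ and $-\frac{2}{3}A_{3:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 1 & -1/3 & 0 & 0 \\ 0 & 7/3 & 0 & 7/2 \\ 0 & 0 & -1 & 5/2 \end{bmatrix} \]

Approach (2): scale with $\frac{1}{3}A_{3:} \to A_{3:}$, then replace with $14A_{3:} + A_{2:} \to A_{2:}$ and $-A_{3:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 3 & -1 & -2 & 5 \\ 0 & 14 & 28 & -49 \\ 0 & 0 & -2 & 5 \end{bmatrix} \to \begin{bmatrix} 3 & -1 & 0 & 0 \\ 0 & 14 & 0 & 21 \\ 0 & 0 & -2 & 5 \end{bmatrix} \]

Approach (3): scale with $2A_{1:} \to A_{1:}$ and $2A_{2:} \to A_{2:}$, then replace with $11A_{3:} + A_{2:} \to A_{2:}$ and $-3A_{3:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 2 & 4 & 6 & -8 \\ 0 & -14 & -22 & 34 \\ 0 & 0 & 2 & -5 \end{bmatrix} \to \begin{bmatrix} 2 & 4 & 0 & 7 \\ 0 & -14 & 0 & -21 \\ 0 & 0 & 2 & -5 \end{bmatrix} \]

Step 6: Scale the row with the pivot so the pivot in that row is 1. Fractions at this point are unavoidable. 

Approach (1): $-A_{3:} \to A_{3:}$. Approach (2): $-\frac{1}{2}A_{3:} \to A_{3:}$. Approach (3): $\frac{1}{2}A_{3:} \to A_{3:}$.

\[ (1)\ \begin{bmatrix} 1 & -1/3 & 0 & 0 \\ 0 & 7/3 & 0 & 7/2 \\ 0 & 0 & 1 & -5/2 \end{bmatrix} \qquad (2)\ \begin{bmatrix} 3 & -1 & 0 & 0 \\ 0 & 14 & 0 & 21 \\ 0 & 0 & 1 & -5/2 \end{bmatrix} \qquad (3)\ \begin{bmatrix} 2 & 4 & 0 & 7 \\ 0 & -14 & 0 & -21 \\ 0 & 0 & 1 & -5/2 \end{bmatrix} \]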


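Step 7: Take the submatrix consisting of all rows above the pivot and return to step 5. The third row of 
the matrix is fixed. 

\[ (1)\ \begin{bmatrix} 1 & -1/3 & 0 & 0 \\ 0 & 7/3 & 0 & 7/2 \end{bmatrix} \qquad (2)\ \begin{bmatrix} 3 & -1 & 0 & 0 \\ 0 & 14 & 0 & 21 \end{bmatrix} \qquad (3)\ \begin{bmatrix} 2 & 4 & 0 & 7 \\ 0 & -14 & 0 & -21 \end{bmatrix} \]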


Step 5: The rightmost pivot position is in column 2. To prepare for the row replacements in approach 
(1) there is nothing to do; in approach (2) the second row is scaled by 1/7 while the first row is scaled by 
2; and in approach (3) the second row is scaled by 1/7. 


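Approach (1): replace with $\frac{1}{7}A_{2:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 1 & 0 & 0 & 1/2 \\ 0 & 7/3 & 0 & 7/2 \end{bmatrix} \]

Approach (2): scale with $\frac{1}{7}A_{2:} \to A_{2:}$ and $2A_{1:} \to A_{1:}$, then replace with $A_{2:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 6 & -2 & 0 & 0 \\ 0 & 2 & 0 & 3 \end{bmatrix} \to \begin{bmatrix} 6 & 0 & 0 & 3 \\ 0 & 2 & 0 & 3 \end{bmatrix} \]

Approach (3): scale with $\frac{1}{7}A_{2:} \to A_{2:}$, then replace with $2A_{2:} + A_{1:} \to A_{1:}$.

\[ \begin{bmatrix} 2 & 4 & 0 & 7 \\ 0 & -2 & 0 & -3 \end{bmatrix} \to \begin{bmatrix} 2 & 0 & 0 & 1 \\ 0 & -2 & 0 & -3 \end{bmatrix} \]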


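Step 6: Scale the row with the pivot so the pivot in that row is 1. Fractions at this point are unavoidable. 

Approach (1): $\frac{3}{7}A_{2:} \to A_{2:}$. Approach (2): $\frac{1}{2}A_{2:} \to A_{2:}$. Approach (3): $-\frac{1}{2}A_{2:} \to A_{2:}$.

\[ (1)\ \begin{bmatrix} 1 & 0 & 0 & 1/2 \\ 0 & 1 & 0 & 3/2 \end{bmatrix} \qquad (2)\ \begin{bmatrix} 6 & 0 & 0 & 3 \\ 0 & 1 & 0 & 3/2 \end{bmatrix} \qquad (3)\ \begin{bmatrix} 2 & 0 & 0 & 1 \\ 0 & 1 & 0 & 3/2 \end{bmatrix} \]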


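Step 7: Take the submatrix consisting of all rows above the pivot and return to step 5. The bottom row 
of the submatrix (second row of the original matrix) is fixed. 

\[ (1)\ \begin{bmatrix} 1 & 0 & 0 & 1/2 \end{bmatrix} \qquad (2)\ \begin{bmatrix} 6 & 0 & 0 & 3 \end{bmatrix} \qquad (3)\ \begin{bmatrix} 2 & 0 & 0 & 1 \end{bmatrix} \]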
Step 5: The rightmost pivot position is in column 1. There are no entries other than the pivot in that 
column, so there are no row replacements to do. Step 6: Scale the row with the pivot so the pivot in 
that row is 1. Fractions at this point are unavoidable. 

Approach (2): $\frac{1}{6}A_{1:} \to A_{1:}$. Approach (3): $\frac{1}{2}A_{1:} \to A_{1:}$.


Step 7: There are no rows above the pivot, so we are done. At this point, our matrices are as follows. 
These matrices are all in reduced row echelon form. 

\[ (1),\ (2),\ \text{and}\ (3):\quad \begin{bmatrix} 1 & 0 & 0 & 1/2 \\ 0 & 1 & 0 & 3/2 \\ 0 & 0 & 1 & -5/2 \end{bmatrix} \]

Notice that all three approaches produced the same reduced row echelon form. This is not an accident and this point 
will be picked up again soon. Writing the linear system corresponding to this reduced row echelon form, we see 

\[ x_1 = \tfrac{1}{2}, \qquad x_2 = \tfrac{3}{2}, \qquad x_3 = -\tfrac{5}{2}. \]

The original system has this one solution. 
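
If you would like to check a hand computation like this one, the row operations are easy to automate. What follows is only a sketch, written in plain Python with exact fractions rather than SageMath. The function name `rref` is our own choice, and the code clears each pivot column in a single pass (above and below the pivot at once) instead of following steps 1 through 7 in order. Since all three approaches above ended at the same reduced row echelon form, we should expect this fourth route to land there too.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows.

    Illustrative helper (not from the text): uses only swaps, scalings,
    and row replacements, with exact Fraction arithmetic.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below pivot_row.
        source = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if source is None:
            continue  # not a pivot column
        m[pivot_row], m[source] = m[source], m[pivot_row]        # row swap
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]         # row scale
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                # row replacement: -factor * (pivot row) + (row r) -> row r
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


# The augmented matrix that approaches (1), (2), and (3) started from.
A = [[3, -1, -2, 5],
     [2, 4, 8, -13],
     [1, 2, 3, -4]]

for row in rref(A):
    print([str(x) for x in row])
```

The last column of the printed matrix should hold the same three values found by hand.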

A matrix containing the coefficients and constants of a linear system (one row for each equation, one column 
for the coefficients of each variable, and the rightmost column for the constants, as usual) is called an augmented 
matrix. A matrix containing the coefficients of a linear system (one row for each equation and one column for the 
coefficients of each variable, as usual) but not the constants is called a coefficient matrix. The coefficient matrix for 
any linear system is a submatrix of the corresponding augmented matrix. The augmented matrix simply has one more 
column—the column of constants for each equation. 

The system 

3x - 2y = 0 
5x - 3y = 0 

has augmented matrix 

\[ \begin{bmatrix} 3 & -2 & 0 \\ 5 & -3 & 0 \end{bmatrix} \]

and coefficient matrix 

\[ \begin{bmatrix} 3 & -2 \\ 5 & -3 \end{bmatrix}. \]

When the constants of a linear system are all zero, it is not necessary to represent the system as an augmented 
matrix. The coefficient matrix will do. After all, a column of zeros at the beginning of the row reduction process will 
be a column of zeros at the end of the process. Row operations do not change the entries of a column of zeros. For 
example, after swapping two rows with zeros in their $j^{\text{th}}$ columns, the $j^{\text{th}}$ column still has zeros in those rows. All 
that happened in column $j$ was two zeros swapped places. The rest of the rows are not involved in the swap, so if their 
$j^{\text{th}}$ columns held zeros before, they hold zeros after as well. Can you explain similarly why row replacement and row 
scaling also leave a column of zeros unchanged? Answer on page 61. A linear system whose constants are all zero is 
called a homogeneous system. Otherwise it is called a nonhomogeneous system. 
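
To watch a column of zeros survive each kind of row operation, here is a small sketch in plain Python. The three helper functions (`swap`, `scale`, `replace`) are names made up for this illustration, and the particular operations applied are arbitrary choices. The matrix is the augmented matrix with rows 3, -2, 0 and 5, -3, 0.

```python
from fractions import Fraction


def swap(m, i, j):
    """Row swap: exchange rows i and j (returns a new matrix)."""
    m = [row[:] for row in m]
    m[i], m[j] = m[j], m[i]
    return m


def scale(m, i, c):
    """Row scale: c * (row i) -> row i, with c nonzero."""
    m = [row[:] for row in m]
    m[i] = [c * x for x in m[i]]
    return m


def replace(m, i, j, c):
    """Row replacement: c * (row j) + (row i) -> row i."""
    m = [row[:] for row in m]
    m[i] = [c * b + a for a, b in zip(m[i], m[j])]
    return m


augmented = [[3, -2, 0],
             [5, -3, 0]]

step1 = scale(augmented, 0, Fraction(1, 3))   # first row becomes 1, -2/3, 0
step2 = replace(step1, 1, 0, -5)              # second row becomes 0, 1/3, 0
step3 = swap(step2, 0, 1)                     # rows trade places

# The rightmost column after each operation.
last_columns = [[row[-1] for row in m] for m in (step1, step2, step3)]
print(last_columns)
```

Every entry printed is zero, which is why the coefficient matrix alone carries all the information for a homogeneous system.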


Crumpet 14: Homogeneous Linear Differential Equations 


A linear differential equation is homogeneous if its constant term is zero and nonhomogeneous otherwise. 
Key Concepts 


augmented matrix A matrix holding a particular arrangement of all the coefficients and constants of a linear system. 


coefficient matrix A matrix holding a particular arrangement of all the coefficients but none of the constants of a 
linear system. 


homogeneous system A linear system with constants all equal to zero. 
nonhomogeneous system A linear system with at least one nonzero constant. 
pivot the leading entry of a matrix in echelon form 

pivot column a column containing a pivot 

pivot position the row and column of a pivot 


row reduction The process of using elementary row operations to transform a matrix whose associated linear system 
is not solved into one whose associated linear system is either solved or could be solved by back-substitution. 


row echelon form An arrangement of the entries of a matrix following completion of the first four steps of the row 
reduction algorithm. 


reduced row echelon form An arrangement of the entries of a matrix following completion of the row reduction 
algorithm. 

Exercises 1 0 0 
(e) | 0 -5 O 
1. Reduce the matrix to row echelon form. 0 0 2 
—2 -4 -10 5 0 -3 
$]-290 
(a) -5 2 -l | ] (f) | 0-4 1 | [S]-290 
e 1 -4 0 0 0 O 
3 5 4 5 3 0 0 
0 03 - 
6 -1 -4 (g) 000 0 [A]-349 
(i) |) 3d 000 0 
2 -=3. =1 
io 4. Solve the nonhomogeneous system by row reduction. 
(d)} 4 1 -3 | [S]-290 (a) -7y, - 2% = -6 
1 1 3 4x, + & = -l 
2. Reduce the matrix in question 1 to reduced row echelon v1 + 4y = -4 
form. [5 ]-290 (b) -3y, - Illy = -5 [S]-290 
3. The coefficient matrix for a homogeneous linear system 
is given. Find one nontrivial solution (not all variables (c) ee TY 8 
equal to zero) of the associated system if possible. As- a i in 
sume the variables of the system are x,,%2,...%, and ety 4 3 
their coefficients appear in that order in the matrix. (d) | (aie 20y ee iB [A]-349 
3-1 
(a) 0 5 | [S]-290 ® 2 7% = 3 
a —4x, + Sx = -3 
- | 0 2 | 45 8 6 
-45x - By = - 
-1 3 OY ie, 2y = 4 a ad 
©) | 0 0 | 
x + 3y - zg = -3 
2 —-2 5 (g) x + y + 2z = -5 
(d)| 0 7 -1 —3x - 10y + 42 = 6 
0 0 3 
-3v) -— 35y. + 103 = 2 ear system. What can you say about the solution(s) of 
(hy) 94, + «130v. — 40v3 = 2 the system if augmented by the given column? 
Ov, + 120. - 3543 = -4 
[S]-291 3 
-y, + 2 + 83 = -6 (a) : 
(i) -Vy So v2 + 5v3 = -6 
—2v, + 2vy ++ «2113 = -1 0 
12x, - 9% - 14,3 = 3 (b) | ; | 
G) 24% - 15% - 21x, = -3 
—9x, + 6x2 +e 7x3 = 2 a 
[A]-349 (c) | b |; a,b,c are arbitrary real numbers 
2w - x + y + 42 = -3 . 
—-w + Xx - 3z = 5 
(kK) —w + x + y - 22 = -2 3.0 0 
= 7 = FS = 6. LetA =| 0 2 O J. What can you say about the 
0 0 -4 
xy - mm» + 43 - wy = 1 solutions of the associated system if 
1 —2x,} + 6x. - 10x3 + 5x, = -3 
() —Xx| + Ox, = Ix = 2 (a) the system is homogeneous and A is the coefficient 
3x, - 2m + Ile, - 2m = 6 matrix? 
[A]-349 . , 
(b) the system is nonhomogeneous and A is the coeffi- 
—2 0 O cient matrix? 
5. Let} O -3 O | be the coefficient matrix for a lin- 
0 0 2 (c) A is the augmented matrix? 
Answers 


associated system: 
3x - 2y = 2 
5x - 3y = 5 


column of zeros: (row scale) After scaling a row with a zero in its $j^{\text{th}}$ column, the $j^{\text{th}}$ column still has a zero in that 
row since 0 times anything is 0. The rest of the rows are not involved in the scale, so if their $j^{\text{th}}$ columns held 
zeros before, they hold zeros after as well. (row replace) After adding a multiple of a row with a zero in its $j^{\text{th}}$ 
column to a different row with a zero in its $j^{\text{th}}$ column, both rows still have zeros in their $j^{\text{th}}$ columns. 0 times 
anything is 0, so a 0 was added to the 0 in the row being replaced. Since 0 plus 0 is 0, the $j^{\text{th}}$ column of that row 
is still zero. The rest of the rows are not involved in the replacement, so if their $j^{\text{th}}$ columns held zeros before, 
they hold zeros after as well. 
2.3. Existence, Uniqueness, and Echelon Forms 

As augmented matrices, 

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 3 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \end{bmatrix} \]

represent fundamentally different linear systems. The first matrix represents the system 

x = 0 
2y = 3 

which has exactly one solution: x = 0, y = 3/2. The second matrix represents the system 


x+2y=0 
0=3 


which has no solution. Even though the first equation has many solutions, the second equation will not be true for 
any values of x and y since 0 simply does not equal 3. The third matrix represents the system 


x+2y=3 
0=0 


which has infinitely many solutions, x = 3 — 2y with y arbitrary. One example is y = 1, x = 1. 
The three associated linear systems have different types of solution sets. The first set of solutions contains exactly 
one element. The second set of solutions is empty. The third set of solutions contains infinitely many elements. 


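The three matrices 

\[ \begin{bmatrix} 7 & 10 & -3 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 7 & 0 & 0 \\ 0 & 10 & -3 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 7 & 10 & 0 \\ 0 & 0 & -3 \end{bmatrix} \]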

are similar to the first three in this way. One of them has an associated linear system with no solution, one with 
infinitely many solutions, and one with exactly one solution. Can you tell which is which? Answer on page 69. 

Comparing the first set of three matrices to the second set of three matrices, you may notice that all six matrices 
have three 0 entries and three nonzero entries, one of each per column. Further, there are only three arrangements of 
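the zeros. Using # to represent the nonzero numbers, the arrangements are 

\[ \begin{bmatrix} \# & \# & \# \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} \# & \# & 0 \\ 0 & 0 & \# \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} \# & 0 & 0 \\ 0 & \# & \# \end{bmatrix} \]
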


corresponding to infinitely many solutions, no solution, and one solution. You can verify that each one is in row 
echelon form by applying the first 4 steps of the row reduction algorithm. The nonzero numbers may change, but the 
form (arrangement of zeros and nonzeros) will not! Additionally, no matter what nonzero numbers take the place of 
the #s, the number of solutions of the associated systems will not change. This is the real power of row echelon form 
matrices. 

More generally, some of the entries of these forms could be replaced by other symbols without disrupting row 
echelon form. Using * to represent any number (including zero), matrices with the following arrangements of entries 
are all in row echelon form 

\[ \begin{bmatrix} \# & * & * \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} \# & * & * \\ 0 & 0 & \# \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} \# & * & * \\ 0 & \# & * \end{bmatrix}. \tag{2.3.1} \]

These more general forms also have infinitely many, zero, and one solution(s), respectively. Imagining the associated 
linear systems will help you verify the number of solutions, but how can you tell they are in row echelon form? 

Step 1 of the row reduction algorithm is applied to the entire matrix and then to each submatrix containing all 
the rows below the last identified pivot. As such, this step simply identifies the pivot positions. In each of the given 
matrices, the leftmost column has a nonzero entry, so it is a pivot column. The topmost (row one) position in this 
column is a pivot position. Returning to step 1 with the submatrix containing “all rows beneath the pivot” (as required 
in step 4) in this case just means the second row: 

\[ \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & \# \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 0 & \# & * \end{bmatrix}. \]
The leftmost nonzero entry in the second row is therefore a pivot position. The pivot positions of matrices (2.3.1) are 
boxed below. 

\[ \begin{bmatrix} \boxed{\#} & * & * \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} \boxed{\#} & * & * \\ 0 & 0 & \boxed{\#} \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} \boxed{\#} & * & * \\ 0 & \boxed{\#} & * \end{bmatrix} \]

Step 2 of the row reduction algorithm requires that each pivot is nonzero. This is already the case so no action is 
required. 

Step 3 of the row reduction algorithm requires all entries below a pivot (and in the same column as the pivot) be 
zero. In all three matrices, the only pivots with entries directly below them are those in the 1,1-entry, and there are zeros 
beneath them in each case, so no action is required. 

Step 4 of the row reduction algorithm sends the algorithm back to the first step. It does not, by itself, provide for 
any changes to the matrix. 

What we can glean from analysis of the algorithm is that in row echelon form, 

1. every pivot is to the right of the pivots in rows above it, and 
2. all rows of zeros are below all rows with nonzero entries. 


Both of these facts are immediate consequences of the algorithm. When a pivot is identified, all entries below it in 
that column are made zero, so when returning to step 1 with the submatrix containing all rows beneath the pivot, the 
column with the previously identified pivot contains only zeros. The leftmost nonzero column remaining must be to 
the right! Requiring that the pivot be nonzero ensures that no row of zeros can appear above a row with nonzeros. 
As long as these two requirements are satisfied, steps 2 and 3 (the only ones of the first 4 that cause any change to a 
matrix) are satisfied, so the matrix is in row echelon form. 

Noticing that a pivot will always be the leftmost nonzero entry in a row makes determining whether a matrix is in 
row echelon form a simple matter. Identify the leftmost nonzero entry of each row. These are the pivots. Make sure 
they are all to the right of the ones above them. Then check to make sure any rows of zeros are at the bottom. 
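
The check just described is mechanical enough to hand to a computer. The sketch below is one possible way to write it in plain Python; the function names `leading_index` and `is_row_echelon` are invented for this illustration. It finds the leftmost nonzero entry of each row, insists that these move strictly to the right going down, and rejects any nonzero row that appears beneath a row of zeros.

```python
def leading_index(row):
    """Column index of the leftmost nonzero entry, or None for a row of zeros."""
    return next((j for j, entry in enumerate(row) if entry != 0), None)


def is_row_echelon(rows):
    """Check the two requirements for row echelon form.

    1. every pivot is to the right of the pivots in rows above it
    2. all rows of zeros are below all rows with nonzero entries
    """
    previous_lead = -1
    seen_zero_row = False
    for row in rows:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row or lead <= previous_lead:
            return False
        previous_lead = lead
    return True


print(is_row_echelon([[1, 0, 0], [0, 2, 3]]))   # pivots in columns 1 and 2
print(is_row_echelon([[1, 2, 0], [0, 0, 3]]))   # pivots in columns 1 and 3
print(is_row_echelon([[0, 0, 0], [1, 2, 3]]))   # row of zeros above a nonzero row
print(is_row_echelon([[0, 1, 2], [0, 3, 4]]))   # second pivot not to the right
```

The first two matrices pass and the last two fail, one for each requirement.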

Given this description of row echelon form matrices, there are four row echelon forms for 2 × 3 matrices besides 
those in (2.3.1). Can you find them? Answer on page 69. Try any other arrangement of 0s, *s, and #s such as 

\[ \begin{bmatrix} * & * & * \\ 0 & \# & * \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \# & \# & * \\ 0 & \# & 0 \end{bmatrix} \]

and you will see that it is either a special case of one of these seven forms or there is a substitution of numbers for which 
the matrix is not in row echelon form. Can you show this is true for these two matrices? Answer on page 69. 

A linear system with at least one solution is called consistent, perhaps deriving from the fact that the equations 
do not contradict one another. A linear system with no solution is called inconsistent. If a row echelon form of its 
augmented matrix has a pivot in the last column, the linear system is inconsistent. Indeed, a row containing a pivot 
in the last column has the form [ 0 0 ⋯ 0 # ], which translates to the equation 0 = # (zero equals a nonzero 
number), which of course is nonsense. Otherwise it is consistent. 

For a consistent system (no row echelon form of its augmented matrix has a pivot in the rightmost column), the 
system can be solved by back-substitution. The pivots can also be used to determine the number of solutions. If a row 
echelon form of the system’s augmented matrix has a pivot in each column (besides the rightmost), the linear system 
has exactly one solution. Otherwise it has infinitely many solutions. The second case is worth closer inspection. 

All of the following augmented matrices are in row echelon form and are pivot-free in the rightmost column plus 
at least one other column. 


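\[ \begin{bmatrix} 1 & 3 & 5 \end{bmatrix}, \quad \begin{bmatrix} -3 & 5 & -2 & -1 \\ 0 & 5 & -4 & 0 \end{bmatrix}, \quad \begin{bmatrix} -4 & 1 & 5 & -7 & 3 \\ 0 & 0 & 5 & 1 & -1 \end{bmatrix} \]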


This is enough to know they all have infinitely many solutions, but writing down those solutions still takes a little 
work. 
The linear system represented by the first matrix is 

\[ x_1 + 3x_2 = 5, \]

a single equation in two variables. Solving for $x_1$, this system has solutions of the form $x_1 = 5 - 3x_2$ with $x_2$ arbitrary. 
This implies that we are free to choose any value for $x_2$ and use the relation $x_1 = 5 - 3x_2$ to determine $x_1$. For example, 
we may let $x_2 = 7$, forcing $x_1 = 5 - 3(7) = -16$. So $(x_1, x_2) = (-16, 7)$ is a solution. In this way, the equation $x_1 = 5 - 3x_2$ (or 
equivalently the equation $x_1 + 3x_2 = 5$) identifies all the solutions of the system, so we could write $\{x_1 = 5 - 3x_2\}$ as 
the solution set. 

Looking back, it would have been just as well to solve for $x_2$, yielding $x_2 = \frac{5 - x_1}{3}$. This formulation would suggest 
we are free to choose the value of $x_1$ and use the formula to determine $x_2$. Doing so makes the solution set $\{x_2 = \frac{5 - x_1}{3}\}$. 

Either variable may be treated as arbitrary, giving what appear to be two different solutions. If this makes you feel 
a little uneasy, you are not alone. Even if we were never to have seen the second solution, the first one may seem a 
little unsatisfying. We have a formula for one variable and an implicit understanding that the other is free to take on 
any value. Perhaps a more satisfying way to write down the set of all solutions is to return to the variable $x_2$, for which 
we are free to assign any arbitrary value and let it be $r$ (for r-bitrary). Doing this, we have $x_2 = r$ and $x_1 = 5 - 3r$ for 
a solution set. In the spirit of linear algebra, this solution can be expressed in matrix notation as 

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 - 3r \\ r \end{bmatrix} \]

or better still as 

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 0 \end{bmatrix} + r \begin{bmatrix} -3 \\ 1 \end{bmatrix}. \]

Take a moment to verify that these matrix representations are equivalent to $x_2 = r$ and $x_1 = 5 - 3r$. The last 
presentation of the solution should feel most satisfying. It does not favor one variable over the other and gives an 
explicit formula for the value of each variable. This form of the solution is called parametric vector form. Returning 
to the variable $x_1$ and setting it equal to $r$ leads to a similar parametric vector form of the solution. Can you find it? 
Answer on page 69. 


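Crumpet 15: Particular and Homogeneous Solutions 


The solution of the linear system 

\[ x_1 + 3x_2 = 5 \]

can be written as 

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}_p + \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}_h \]

where $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}_p$, called the particular solution, is any one (particular) solution of the system and $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}_h$, called the 
homogeneous solution, is the solution of the corresponding homogeneous system, $x_1 + 3x_2 = 0$. For example, $\begin{bmatrix} 5 \\ 0 \end{bmatrix}$, 
$\begin{bmatrix} -1 \\ 2 \end{bmatrix}$, and $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ are valid particular solutions. The homogeneous system has solution $x_1 = -3x_2$, so any solution 
that includes all instances where the first variable is $-3$ times the second, as in $r\begin{bmatrix} -3 \\ 1 \end{bmatrix}$ or $s\begin{bmatrix} 3 \\ -1 \end{bmatrix}$, 
is a valid homogeneous solution. The set of all sums of one particular solution and one homogeneous solution is the 
general solution of the linear system. See section 5.1. 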


Putting the second matrix, 

\[ \begin{bmatrix} -3 & 5 & -2 & -1 \\ 0 & 5 & -4 & 0 \end{bmatrix}, \]

into reduced row echelon form will facilitate writing the solution of the corresponding linear system. Subtracting row 
2 from row 1 and scaling each row appropriately yields reduced row echelon form 

\[ \begin{bmatrix} 1 & 0 & -2/3 & 1/3 \\ 0 & 1 & -4/5 & 0 \end{bmatrix} \]
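(you should verify this) suggesting that the simplest way to write the solution is 

\[ x_1 = \tfrac{1}{3} + \tfrac{2}{3}x_3, \qquad x_2 = \tfrac{4}{5}x_3. \]

This solution further suggests that we let $x_3$ be r-bitrary and write the solution as 

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1/3 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} 2/3 \\ 4/5 \\ 1 \end{bmatrix}. \]

For a solution without fractions, we can let $r = 10 + 15s$ (after all, $r$ is arbitrary!), which gives 

\begin{align*}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
&= \begin{bmatrix} 1/3 \\ 0 \\ 0 \end{bmatrix} + (10 + 15s) \begin{bmatrix} 2/3 \\ 4/5 \\ 1 \end{bmatrix} \\
&= \begin{bmatrix} 1/3 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 20/3 + 10s \\ 8 + 12s \\ 10 + 15s \end{bmatrix} \\
&= \begin{bmatrix} 7 \\ 8 \\ 10 \end{bmatrix} + s \begin{bmatrix} 10 \\ 12 \\ 15 \end{bmatrix}
\end{align*}
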

where s is the arbitrary variable. There will always be various ways to write the parametric form of the solution of a 
system with infinitely many solutions. 
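
If two different-looking parametric forms make you suspicious, a quick numerical check can be reassuring. The sketch below, in plain Python with exact fractions, plugs several values of the parameter into both forms of the solution and confirms that each result satisfies the system represented by the second matrix, namely -3x1 + 5x2 - 2x3 = -1 and 5x2 - 4x3 = 0. The function names are invented for this illustration, and the sample parameter values are arbitrary.

```python
from fractions import Fraction as F


def satisfies_system(x):
    """True when (x1, x2, x3) satisfies both equations of the system."""
    x1, x2, x3 = x
    return -3 * x1 + 5 * x2 - 2 * x3 == -1 and 5 * x2 - 4 * x3 == 0


def solution_in_r(r):
    """The form (1/3, 0, 0) + r (2/3, 4/5, 1)."""
    return (F(1, 3) + F(2, 3) * r, F(4, 5) * r, F(1) * r)


def solution_in_s(s):
    """The fraction-free form (7, 8, 10) + s (10, 12, 15)."""
    return (7 + 10 * s, 8 + 12 * s, 10 + 15 * s)


samples = [F(-2), F(0), F(1, 2), F(3), F(7, 5)]

print(all(satisfies_system(solution_in_r(r)) for r in samples))
print(all(satisfies_system(solution_in_s(s)) for s in samples))
# The substitution r = 10 + 15s turns one form into the other.
print(all(solution_in_s(s) == solution_in_r(10 + 15 * s) for s in samples))
```

A finite number of samples is not a proof, of course, but the algebra behind the substitution is short enough to check by hand.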
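Putting the third matrix into reduced row echelon form will facilitate writing down its solution as well. 

\[ \begin{bmatrix} -4 & 1 & 5 & -7 & 3 \\ 0 & 0 & 5 & 1 & -1 \end{bmatrix} \]

is reduced by subtracting the second row from the first and then scaling the rows appropriately. The resulting reduced 
row echelon form is 

\[ \begin{bmatrix} 1 & -1/4 & 0 & 2 & -1 \\ 0 & 0 & 1 & 1/5 & -1/5 \end{bmatrix} \]

from which we conclude that $x_1 = -1 + \frac{1}{4}x_2 - 2x_4$ and $x_3 = -\frac{1}{5} - \frac{1}{5}x_4$. Variables $x_1$ and $x_3$ are easily written in terms 
of $x_2$ and $x_4$, suggesting we allow $x_2$ and $x_4$ to be arbitrary. Letting $x_2 = r$ and $x_4 = s$, it follows that $x_1 = -1 + \frac{1}{4}r - 2s$ 
and $x_3 = -\frac{1}{5} - \frac{1}{5}s$, so 

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ -1/5 \\ 0 \end{bmatrix} + r \begin{bmatrix} 1/4 \\ 1 \\ 0 \\ 0 \end{bmatrix} + s \begin{bmatrix} -2 \\ 0 \\ -1/5 \\ 1 \end{bmatrix}. \]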


Two variables may be set arbitrarily this time, a consequence of the fact that we have four variables and only two 
pivots. Each variable column without a pivot gives a variable that may be set arbitrarily, what are known as free 
variables. Variables represented by columns with pivots are called basic variables. Other forms of the solution may 
be obtained by letting different pairs of variables be the arbitrary ones or by making substitutions for the arbitrary 
r and s in the above solution, such as r = 4t and s = 4 + 5u. Can you write the solution that results from this 
substitution? Answer on page 69. 
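
Reading basic variables, free variables, and a parametric vector form off a reduced row echelon form follows a fixed recipe, so it can be sketched in code. The plain Python below is one possible version, not a library routine. The function name `parametric_form` is made up, columns are numbered from 0 as Python does, and the input is assumed to be the reduced row echelon form of the augmented matrix of a consistent system.

```python
from fractions import Fraction as F


def parametric_form(rref_rows):
    """Return (basic columns, free columns, particular vector, direction vectors).

    Assumes rref_rows is a reduced row echelon form augmented matrix
    of a consistent system; columns are indexed from 0.
    """
    n_vars = len(rref_rows[0]) - 1
    pivot_row_of = {}                       # pivot column -> row holding that pivot
    for i, row in enumerate(rref_rows):
        lead = next((j for j, x in enumerate(row[:n_vars]) if x != 0), None)
        if lead is not None:
            pivot_row_of[lead] = i
    free = [j for j in range(n_vars) if j not in pivot_row_of]
    # Setting every free variable to 0 gives one particular solution.
    particular = [rref_rows[pivot_row_of[j]][-1] if j in pivot_row_of else F(0)
                  for j in range(n_vars)]
    # Each free variable contributes one vector multiplying its parameter.
    directions = []
    for f in free:
        d = [F(0)] * n_vars
        d[f] = F(1)
        for j, i in pivot_row_of.items():
            d[j] = -rref_rows[i][f]
        directions.append(d)
    return sorted(pivot_row_of), free, particular, directions


R = [[F(1), F(-1, 4), F(0), F(2), F(-1)],
     [F(0), F(0), F(1), F(1, 5), F(-1, 5)]]

basic, free, particular, directions = parametric_form(R)
print(basic, free)
print([str(x) for x in particular])
for d in directions:
    print([str(x) for x in d])
```

With the reduced matrix from the third example, columns 0 and 2 (variables x1 and x3) come out basic, columns 1 and 3 (x2 and x4) come out free, and the vectors match the parametric vector form found by hand.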

The observations that free variables lead to infinitely many solutions and a pivot in the rightmost column of an 
augmented matrix leads to no solution justify the existence and uniqueness theorem for linear systems. 


Theorem 1. [Existence and Uniqueness] A linear system is consistent if and only if the rightmost column of its 
augmented matrix representation is not a pivot column. Furthermore, a consistent system will have (a) exactly one 
solution if it admits no free variables; and (b) infinitely many solutions if it admits at least one free variable. 
Thus the nature of the solution set for any linear system can be determined from a row echelon form of its associated 
augmented matrix. In problems where this is the entire question at hand, the row echelon form suffices, and can save 
a bit of time compared to using the reduced row echelon form. The reduced row echelon form, being a row echelon 
form itself, can be used for this purpose too, but better serves as a place from which to write down the solutions of the 
system. Thus in problems where the solution set for a linear system is needed, it is usually worth the time and effort 
to find the reduced row echelon form. 
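
Theorem 1 translates almost word for word into a short program. The sketch below is plain Python with invented function names, and it assumes the matrix handed to it is already a row echelon form of an augmented matrix. It locates the pivot columns and then asks the two questions the theorem asks: is the rightmost column a pivot column, and is there a variable column without a pivot?

```python
def leading_index(row):
    """Column index of the leftmost nonzero entry, or None for a row of zeros."""
    return next((j for j, entry in enumerate(row) if entry != 0), None)


def classify(echelon_rows):
    """Describe the solution set of the system behind a row echelon form
    augmented matrix (the input is assumed to be in row echelon form)."""
    n_vars = len(echelon_rows[0]) - 1
    pivot_columns = {leading_index(row) for row in echelon_rows} - {None}
    if n_vars in pivot_columns:          # pivot in the rightmost column
        return "no solution"
    if len(pivot_columns) == n_vars:     # no free variables
        return "exactly one solution"
    return "infinitely many solutions"   # at least one free variable


print(classify([[1, 0, 0], [0, 2, 3]]))
print(classify([[1, 2, 0], [0, 0, 3]]))
print(classify([[1, 2, 3], [0, 0, 0]]))
```

The three matrices tested are the ones that opened this section, and the three answers agree with the solution sets found there.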

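Reduced row echelon form requires that the entries above pivots are zero (step 5) and that each pivot is 1 (step 6), 
so reduced row echelon form matrices are row echelon form matrices with these extra two properties. The reduced 
row echelon form of a 2 × 3 matrix will take one of the following seven forms, for example. 

\[ \begin{bmatrix} 1 & 0 & * \\ 0 & 1 & * \end{bmatrix}, \ \begin{bmatrix} 1 & * & 0 \\ 0 & 0 & 1 \end{bmatrix}, \ \begin{bmatrix} 1 & * & * \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & * \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

It is not by mistake that we refer to a row echelon form or the reduced row echelon form of a matrix. Row echelon 
form is not unique, but reduced row echelon form is unique (see crumpet 23 on page 160). 
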


SageMath 


If M is a matrix in SageMath, then M.echelon_form() returns a row echelon form and M.rref() returns the reduced 
row echelon form. The following code computes a row echelon form and the reduced row echelon form of 

\[ M = \begin{bmatrix} -480 & -340 & -110 & -110 & 100 \\ 242 & 146 & 54 & 54 & -60 \\ -721 & -673 & -277 & -152 & 155 \\ 968 & 809 & 316 & 191 & -215 \\ -1039 & -882 & -268 & -268 & 170 \end{bmatrix} \]

M = matrix(5,5, [-480,-340,-110,-110,100,242,146,54,54,-60,-721,-673,-277,
-152,155,968,809,316,191,-215,-1039,-882,-268,-268,170])
print("M ="); print(M); print()
print("Row echelon form:"); print(M.echelon_form()); print()
print("Reduced row echelon form:"); print(M.rref())


SageMath Cell 40 

The output of the code is 

M =
[ -480  -340  -110  -110   100]
[  242   146    54    54   -60]
[ -721  -673  -277  -152   155]
[  968   809   316   191  -215]
[-1039  -882  -268  -268   170]

Row echelon form:
[  1  13  87 462  45]
[  0  25  25 150  25]
[  0   0 125 625  50]
[  0   0   0 750 150]
[  0   0   0   0   0]

Reduced row echelon form:
[   1    0    0    0 -2/5]
[   0    1    0    0  2/5]
[   0    0    1    0 -3/5]
[   0    0    0    1  1/5]
[   0    0    0    0    0]
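
It never hurts to double check software output. Reading M as an augmented matrix, the reduced row echelon form says the first four variables equal -2/5, 2/5, -3/5, and 1/5. The sketch below, in plain Python rather than SageMath, substitutes those values into every row of the original M and confirms each equation holds. The names `values` and `left_side` are our own.

```python
from fractions import Fraction

M = [[-480, -340, -110, -110, 100],
     [242, 146, 54, 54, -60],
     [-721, -673, -277, -152, 155],
     [968, 809, 316, 191, -215],
     [-1039, -882, -268, -268, 170]]

# Values of the four variables read from the last column of the reduced form.
values = [Fraction(-2, 5), Fraction(2, 5), Fraction(-3, 5), Fraction(1, 5)]


def left_side(row):
    """Evaluate the left side of the equation stored in one row of M."""
    return sum(value * coefficient for value, coefficient in zip(values, row[:4]))


checks = [left_side(row) == row[4] for row in M]
print(checks)
```

Each of the five rows should report True, including the fifth, whose equation became the row of zeros at the bottom of both echelon forms.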
Key Concepts 

basic variable a variable corresponding to the column of an augmented matrix with a pivot. 

free variable a variable corresponding to the column of an augmented matrix with no pivot. 

consistent (linear system) having at least one solution. 

inconsistent (linear system) having no solution. 

existence and uniqueness of solutions (for linear systems) see theorem 1. 

parametric vector form a linear combination of vectors. 

reduced row echelon form of a matrix is unique. 


Exercises 16 0 -2 0 5 
001 3 0 1 
1. The augmented matrix for a linear system is given. (i) (c) 000 041 -2 
Determine whether it is in row echelon form. If it is, state 000 0 0 #0 
(11) whether the associated linear system is consistent or 1 0 0 
inconsistent, and (iii) how many solutions it has. @loo. 2 | [A}-349 
14 0 0 3 -1 0 1 0 4 
0 0 0 -1 2 #3 (e) | 00 1 -8 O 
1 0 -8 0 00 0 1 
b A]-349 
©) Oo 1 5 | un 10-9 0 4 
2 4 4 1 1 3 (f)| 0 1 #1 O 3 | [A]-349 
| © OC EA a | pee oS ae ke 
0 -5 0 3 =5 3 3. The augmented matrix for a linear system is given. (i) 
0 0 0 0 -5 4 Is the system homogeneous? If yes, (ii) find the solution 
302 -8 set, in parametric vector form if infinite. 
(d)} 0 1 1 9 [A]-349 10 6 O 
0 0 0 6 (a)| 0 1 -8 O 
00 0 1 
6 -5 4 
() 0 0 -3 1 1 0 0 
(b)| 0 3 1 0 
PO os 00 4 0 
(f) 3 0O -!l 
2 0 -2 0 
—-3 4 -1 -5 
(c) | 0 3 1 O | [A]-349 
3. -2 5 6 #7 0 O 0 
0 —2 0 7 -4 
®™lo o 1 1 0 aa 7 
0 0 0 18 37 (d) | O 2 -5 O | [A]-349 
00 1 0 
a= 2 4. The coefficient matrix for a homogeneous linear system 
(bh) | 0 -3 0 =I | [A}349 is given. Find the solution set. If the solution set is infi- 
oo 0 0 nite give your answer in parametric vector form. 
2. The augmented matrix for a linear system is given. (1) 1 6 2 
Determine whether it is in reduced row echelon form. If (a) | 2 10 -3 
it is, (ii) find the solution set, in parametric vector form 3 16 —-5 
if infinite. > 4 5 
(a) 1 7 4 (b) | O 2 -7 | [S]-291 
0 1 0 0 -2 #7 
1 0 0 5 6 7 -3 
(b)}| 0 1 O -2 (c) | 0 4 2 [A]-349 
001 -3 00 -1 


3-6 8 (b) How many solutions does the system have accord- 
(d) | -6 12 “18 ing to Alex’s work? According to Bianca’s? 
: ‘ 3 (c) How many free variables does the system have ac- 
I 4 8 cording to Alex’s work? According to Bianca’s? 
(e) | -3 -10 -20 . . 
2 8 15 (d) Can they both be right assuming the columns of 
their augmented matrices both hold the coefficients 
2 1 -6 of x, y, z in that order? 
(f) | -4 -2 12 | [A]-349 : 
1 1 3 (e) Suppose the columns of Alex’s augmented matrix 
2 hold the coefficients of x,y,z in that order, but 
5. Use row reduction and a reduced row echelon form ma- Bianca’s columns hold the coefficients in the order 
trix to find all the eigenvectors. All eigenvalues for the x, z, y? Now can they both be right? 
matrix are given. 
11. The reduced row echelon forms for the coefficient ma- 
(a) 12-8 | A=0,2 trices of two linear systems of three equations in three 
15-10 variables are 
(b) ie a faqt-4 [A]-349 | 1 x | | 1 * 7 
0 O Lljfand} 0 0 O 
| 6 -4 16 0 0 0 0 0 0 
(c) 3-7 4 |;A=-6,-3 [S]-291 
6 2 -14 so both systems have infinitely many solutions. Is it pos- 
7 2 -1 sible the systems are the same? Explain. 
(d) | 20 6 4 | A=-2,0 12. A system of linear equations with fewer equations than 
=§ 2 <3 unknowns is sometimes called underdetermined. Sup- 
ose such a system were consistent. Explain why the 
6. Using 0, #, x notation, list all the posible row echelon a must ae an infinitenumber-of aa Al 
forms for a 3 x 3 matrix. 349 
iy WES pe aur about ee and eeuae 13. A system of linear equations with more equations than 
of solutions of the system associated with the described . : . 
enatrig® unknowns is sometimes called overdetermined. Can 
such a system be consistent? Illustrate your answer with 
(a) 4x7 coefficient matrix with 4 pivots an example (if yes) or an argument as to why not (if 
(b) 4x7 augmented matrix with 4 pivots no). [A]-349 
(c) 7 x 4 coefficient matrix with 4 pivots 14. 3) SageMath Cell Use SageMath to find (i) a row ech- 
(d) 7x4 augmented matrix with 4 pivots elon form (but not reduced row echelon form) and (ii) the 
reduced row echelon form of 
8. What conditions on the pivots of an augmented matrix 
would ensure the associated system had a unique solu- 2049-4548 -S511 -5177 = 6023 
tion? —4526 10252 916 11438  -13292 
9. What conditions on the pivots of a coefficient matrix =O0e) pee Lia AO) = 20a 
would ensure the associated system had a unique solu- =1585. 2868 263 21660 3097 
ion? —5781 12812 1211 14321 -16671 
10. Alex and Bianca are solving the same system of equa- [S]-292 
tions. For the reduced row echelon form of the aug- 
mented matrix, Alex got : 15. £3) Sagenath Cell] 42 Use SageMath to find the re- 
duced row echelon form of the augmented matrix, and 
1 0 * x write down the solution of the corresponding linear 
| O lk x | system. Assume the columns represent the variables 
0 0 0 O X1,X2,.X3, X4, X5 in that order. 
and Bianca got 27 -36 -4 2 4 58 
1 * " 12 -16 -2 1 2 27 
foo isl 15 -20 -4 2 4 42 
0 0 0 3 4 +2 12 +7 
aS 16. £3) Sage ath Cell] 43 Suppose the matrix in question 15 


(a) Is the system consistent according to Alex’s work? 
According to Bianca’s? 


is the coefficient matrix of a homogeneous linear system 
with variables x1, x2, x3, X4, x5, X6 and repeat the exercise. 
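Answers 

which is which? The three matrices 

\[ \begin{bmatrix} 7 & 10 & -3 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 7 & 0 & 0 \\ 0 & 10 & -3 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 7 & 10 & 0 \\ 0 & 0 & -3 \end{bmatrix} \]

have associated linear systems with infinitely many solutions, one solution, and no solution, respectively. 

other row echelon forms The other four possible row echelon forms for 2 × 3 matrices are 

\[ \begin{bmatrix} 0 & \# & * \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & \# & * \\ 0 & 0 & \# \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & \# \\ 0 & 0 & 0 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \]

cases where the first variable does not appear in either of the equations. The last two forms, where neither 
variable appears in either equation, have arguable applicability. 

substitution The matrix $\begin{bmatrix} \# & \# & * \\ 0 & \# & 0 \end{bmatrix}$ is in row echelon form for any substitution of numbers, but is a special case 
of $\begin{bmatrix} \# & * & * \\ 0 & \# & * \end{bmatrix}$. The matrix $\begin{bmatrix} * & * & * \\ 0 & \# & * \end{bmatrix}$ is not in row echelon form for the substitution $\begin{bmatrix} 0 & 0 & * \\ 0 & \# & * \end{bmatrix}$; the 
pivot in the second column cannot be zero. 

parametric vector form Setting $x_1 = r$ yields $x_2 = \frac{5 - r}{3}$, which is equivalent to 

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} r \\ \frac{5 - r}{3} \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 5/3 \end{bmatrix} + r \begin{bmatrix} 1 \\ -1/3 \end{bmatrix}. \]

To make it look a little more like the previous solution, let $r = -3s$, which gives 

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 5/3 \end{bmatrix} + s \begin{bmatrix} -3 \\ 1 \end{bmatrix}. \]

substitution 2 The solution with substitution is 

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ -1/5 \\ 0 \end{bmatrix} + 4t \begin{bmatrix} 1/4 \\ 1 \\ 0 \\ 0 \end{bmatrix} + (4 + 5u) \begin{bmatrix} -2 \\ 0 \\ -1/5 \\ 1 \end{bmatrix} = \begin{bmatrix} -9 \\ 0 \\ -1 \\ 4 \end{bmatrix} + t \begin{bmatrix} 1 \\ 4 \\ 0 \\ 0 \end{bmatrix} + u \begin{bmatrix} -10 \\ 0 \\ -1 \\ 5 \end{bmatrix}. \]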
Matrix Algebra 


3.1 Properties of Matrix Operations 


Background 


If you ever thought that algebra should be renamed “find x” or “how to solve equations”, you are not alone. The 
study of algebra is largely concerned with solving equations. Linear algebra is considerably less concerned with 
solving equations, but it is still an important feature of the subject. There are many similarities between the rules that 
govern the manipulation of algebraic expressions involving real numbers and those governing the manipulation of 
expressions involving matrices, but there are significant differences, all worthy of recording. First, a few words about 
the arithmetic of real numbers. 

During the late 19th century, the European mathematics community set to the task of answering foundational 
questions about arithmetic. They debated the questions of how to define the natural numbers and the real numbers; 
how to define the operations of addition and multiplication; and just as importantly to what extent such definitions 
were useful. The German mathematician Hermann Günther Grassmann (Graßmann) is generally credited with sparking 
the debate by showing that properties of the natural numbers that to that point had simply been taken for granted 
(such as the fact that a + b = b + a) could be proved from simpler principles. After a number of developments, 
the Italian mathematician Giuseppe Peano published his Arithmetices Principia [20] (Principles of Arithmetic, 1889) 
summarizing work to that point and adding his stamp in the form of his five axioms defining the natural numbers. 


Crumpet 16: Foundations of Analysis 


In 1930 Edmund Landau published the first edition of his Grundlagen der Analysis (published in English in 1951 as 
Foundations of Analysis [15], available at https://b-ok.cc/book/2863641/855790, accessed Feb 9, 2021) where, based 
on the work of Giuseppe Peano, Richard Dedekind, Augustin-Louis Cauchy and others, he develops the arithmetic of 
whole, rational, irrational and complex numbers in a single volume. Peano’s five axioms defining the natural numbers 
appear on page 2 as follows. 


Axiom 1: 1 is a natural number. 

That is, our set is not empty; it contains an object called 1 (read “one”). 

Axiom 2: For each x there exists exactly one natural number, called the successor of x, which will be 
denoted x’. 

In the case of complicated natural numbers x, we will enclose in parentheses the number whose successor 
is to be written down, since otherwise ambiguities might arise. We will do the same, throughout this 
book, in the case of x + y, xy, x — y, —x, x’,etc. 

Thus, if 
x = y 
then 
x’ = y’. 

Axiom 3: We always have 
x’ ≠ 1. 
That is, there exists no number whose successor is 1. 

Axiom 4: If 
x’ = y’ 
then 
x = y. 
That is, for any given number there exists either no number or exactly one number whose successor is 
the given number. 

Axiom 5 (Axiom of Induction): Let there be given a set 𝔐 of natural numbers, with the following 
properties: 
I) 1 belongs to 𝔐. 
II) If x belongs to 𝔐 then so does x’. 

Then 𝔐 contains all the natural numbers. 


Kurt Friedrich Gödel’s incompleteness theorems, published in 1931 [9], essentially concluded the debate. He proved 
that any consistent axiomatic system sufficient to describe arithmetic on the natural numbers (including Peano’s five 
axioms) will admit statements that can be neither proven nor disproven from within the system. Despite this deficiency, 
Peano’s axioms are sufficient to define natural numbers and prove the familiar properties of the operations of addition 
and multiplication. With the comfort of knowing these facts rest on a solid foundation, we will assume their veracity. 
That is not to say we will simply take them for granted, however. 

1 + 3 equals 4, and 3 + 1 equals 4. 7 + 9 equals 16, and 9 + 7 equals 16. The more general statement that a + b = b + a 
for any numbers a and b is called the commutative property for addition of real numbers. Though this property is one 
of the basic principles that can be proven based on even more basic principles, we will take the viewpoint that it had 
to turn out this way! The counting numbers, 1, 2, 3, ..., represent how many of a thing we have (quantity). This is 
a fact engrained in our minds as we learn to count, at a very young age. Addition models what happens when two 
quantities are merged, another concept engrained in our minds early on in our mental development. If you add three 
apples to a basket initially holding one apple, afterward it will contain four apples. This merger is modeled by the 
statement 1 + 3 = 4 (one apple plus three more apples gives you four apples). Similarly if you add one apple to a 
basket initially holding three apples, afterward it will contain four apples. This merger is modeled by the statement 
3 + 1 = 4. The fact that a pair of natural numbers can be added in either order with the same result is simply the way 
natural numbers work. Any mathematical axiom, theorem, or proof suggesting otherwise would simply not describe 
the number system we were taught as youngsters. 


Crumpet 17: An Interesting Addition Table 


If we completely abandon the usual notions of natural numbers (that there are infinitely many of them and that they 
represent quantities, for example), addition might be defined as follows. 


 + | 1   2   3   4   5   6 
---+----------------------- 
 1 | 2   6   4   5   3   1 
 2 | 6   1   5   3   4   2 
 3 | 5   4   6   2   1   3 
 4 | 3   5   1   6   2   4 
 5 | 4   3   2   1   6   5 
 6 | 1   2   3   4   5   6 


This system of addition retains some of the familiar notions of arithmetic such as associativity and the existence of an 
identity (can you tell which symbol acts as the identity?), but not commutativity. According to this table, 1 + 3 = 4 
but 3 + 1 = 5! Addition in this system is not commutative, but that does not make this number system “wrong”. It 
simply means this system, an example of a finite group, does not represent the numbers as we commonly understand 
them. It makes a poor model for the measurement of quantity. 
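
The claims about this table can be checked by brute force. The following is a minimal sketch in Python, assuming the table reads as encoded below, with rows giving the left-hand symbol and columns the right-hand symbol, and with 6 as the sixth symbol (an assumption about the printed table). The names SYMBOLS, TABLE, and add are chosen here for illustration only.

```python
# Addition table from Crumpet 17 as read here (an assumption about the printed table).
# TABLE[x] lists x + 1, x + 2, ..., x + 6 in that order.
SYMBOLS = [1, 2, 3, 4, 5, 6]
TABLE = {
    1: [2, 6, 4, 5, 3, 1],
    2: [6, 1, 5, 3, 4, 2],
    3: [5, 4, 6, 2, 1, 3],
    4: [3, 5, 1, 6, 2, 4],
    5: [4, 3, 2, 1, 6, 5],
    6: [1, 2, 3, 4, 5, 6],
}

def add(x, y):
    """Look up x + y in the table (row x, column y)."""
    return TABLE[x][SYMBOLS.index(y)]

# Associativity: (x + y) + z == x + (y + z) for every triple.
associative = all(
    add(add(x, y), z) == add(x, add(y, z))
    for x in SYMBOLS for y in SYMBOLS for z in SYMBOLS
)

# Identity: a symbol e with e + x == x == x + e for every x.
identities = [e for e in SYMBOLS if all(add(e, x) == x == add(x, e) for x in SYMBOLS)]

# Commutativity: x + y == y + x for every pair.
commutative = all(add(x, y) == add(y, x) for x in SYMBOLS for y in SYMBOLS)

print("associative:", associative)
print("identities:", identities)
print("commutative:", commutative)
```

Running the check is a spoiler for the question about the identity, so try answering it by inspection first.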


Some Properties of Matrices 


Table 3.1: Some Properties of Real Numbers 
For all real numbers a, b, c 

a. a + b = b + a                  Commutative property for addition 
b. (a + b) + c = a + (b + c)      Associative property for addition 
c. a + 0 = a = 0 + a              Additive identity 
d. a(bc) = (ab)c                  Associative property for multiplication 
e. a(b + c) = ab + ac             Distributive property 
f. 1 · a = a = a · 1              Multiplicative identity 


Table 3.1 summarizes the properties of real numbers of interest to our study of linear algebra, each of which has a 
matrix analog as shown in the following theorem. An m × n matrix whose entries are all zero is called a zero matrix 
and is denoted $0_{m \times n}$ or just 0 when its size is discernible through context. 

Theorem 2. [Algebraic Properties of Matrices, Part 1] For all matrices A, B, C 

1. A + B = B + A (commutative property for addition) 

2. (A + B) + C = A + (B + C) (associative property for addition) 

3. A + 0 = A = 0 + A (additive identity) 

4. A(BC) = (AB)C (associative property for multiplication) 

5. A(B + C) = AB + AC (left distributive property) 

6. IA = A = AI (multiplicative identity) 

whenever the indicated operations are defined. 


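Claim 1 can be proven by noting that 
$(A + B)_{i,j} = A_{i,j} + B_{i,j}$ 

and 
$(B + A)_{i,j} = B_{i,j} + A_{i,j}$ 

by definition of matrix addition. Since $A_{i,j}$ and $B_{i,j}$ are numbers, the commutative property for addition of real 
numbers allows us to use the fact that $A_{i,j} + B_{i,j} = B_{i,j} + A_{i,j}$ to deduce that $(A + B)_{i,j} = (B + A)_{i,j}$. Since the entries of 
A + B and B + A are equal, the matrices are equal. 

In more words than symbols, we might argue as follows. The i,j-entries of A + B and B + A are calculated by adding 
the same two entries of A and B, only in different orders. Since addition of real numbers (entries) is commutative, the 
i,j-entries are equal. Hence A + B = B + A. 

In more symbols than words, a third way to see that theorem 2 claim 1 is true is to consider the following computation. 

\[
\begin{aligned}
A + B &=
\begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m,1} & A_{m,2} & \cdots & A_{m,n}
\end{bmatrix}
+
\begin{bmatrix}
B_{1,1} & B_{1,2} & \cdots & B_{1,n} \\
B_{2,1} & B_{2,2} & \cdots & B_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
B_{m,1} & B_{m,2} & \cdots & B_{m,n}
\end{bmatrix} \\
&=
\begin{bmatrix}
A_{1,1} + B_{1,1} & A_{1,2} + B_{1,2} & \cdots & A_{1,n} + B_{1,n} \\
A_{2,1} + B_{2,1} & A_{2,2} + B_{2,2} & \cdots & A_{2,n} + B_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m,1} + B_{m,1} & A_{m,2} + B_{m,2} & \cdots & A_{m,n} + B_{m,n}
\end{bmatrix} \\
&=
\begin{bmatrix}
B_{1,1} + A_{1,1} & B_{1,2} + A_{1,2} & \cdots & B_{1,n} + A_{1,n} \\
B_{2,1} + A_{2,1} & B_{2,2} + A_{2,2} & \cdots & B_{2,n} + A_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
B_{m,1} + A_{m,1} & B_{m,2} + A_{m,2} & \cdots & B_{m,n} + A_{m,n}
\end{bmatrix} \\
&= B + A
\end{aligned}
\]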


In essence, the fact that matrices obey the rule A + B = B + A follows directly from the commutative property for 
addition of real numbers and the fact that addition of matrices is computed component-wise. The rest of the claims in 
theorem 2 can also be justified by use of the corresponding property for real numbers and careful application of the 
definitions of matrix addition and multiplication. 

As noted in section 1.3, matrix multiplication is not commutative. The familiar commutative property for multiplication 
of real numbers, ab = ba, does not have a matrix analog. Thus it is important to point out, for example, that the 
distributive property for matrices holds for both left-multiplication (as in theorem 2 claim 5) and right-multiplication. 
Additionally, the interplay between scalar multiplication and both matrix addition and matrix multiplication must be 
documented, as seen in the following theorem. 


Theorem 3. [Algebraic Properties of Matrices, Part 2] For all matrices A, B, C and scalars r, s 

1. r(A + B) = rA + rB (distributivity of a scalar over matrices) 

2. (r + s)A = rA + sA (distributivity of a matrix over scalars) 

3. (rs)A = r(sA) (associativity of multiplication between two scalars and one matrix) 

4. r(AB) = (rA)B = A(rB) (associativity of multiplication between a scalar and two matrices) 

5. (B + C)A = BA + CA (right distributive property) 

whenever the indicated operations are defined. 


Again these claims can be justified by use of the corresponding property for real numbers and careful application of 
the definitions of matrix addition and multiplication. 
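
A numerical spot check is no substitute for a proof, but it can make the claims of theorem 3 feel less abstract. The sketch below, in plain Python with matrices stored as lists of rows, tests each claim on one arbitrarily chosen example and also confirms that AB and BA differ for that example. The helper names mat_add, mat_mul, and scale and the sample matrices are illustrative choices, not notation from the text.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def scale(r, A):
    """Scalar multiple rA."""
    return [[r * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
r, s = 3, -2

checks = {
    "claim 1: r(A + B) = rA + rB": scale(r, mat_add(A, B)) == mat_add(scale(r, A), scale(r, B)),
    "claim 2: (r + s)A = rA + sA": scale(r + s, A) == mat_add(scale(r, A), scale(s, A)),
    "claim 3: (rs)A = r(sA)": scale(r * s, A) == scale(r, scale(s, A)),
    "claim 4: r(AB) = (rA)B = A(rB)":
        scale(r, mat_mul(A, B)) == mat_mul(scale(r, A), B) == mat_mul(A, scale(r, B)),
    "claim 5: (B + C)A = BA + CA":
        mat_mul(mat_add(B, C), A) == mat_add(mat_mul(B, A), mat_mul(C, A)),
}
for name, holds in checks.items():
    print(name, holds)

# Multiplication itself is not commutative for this pair.
products_commute = mat_mul(A, B) == mat_mul(B, A)
print("AB == BA:", products_commute)
```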

One final theorem for this section contains a list of identities concerning the matrix transpose and inverse. Claims 
concerning only the transpose can be proven by comparing i, j-entries as before while those concerning the inverse 
are most easily proven without reference to individual entries. 


Theorem 4. [Algebraic Properties of Matrices, Part 3] For all matrices A and B and scalar r 

1. $(A^T)^T = A$ (transpose of the transpose) 

2. $(rA)^T = rA^T$ (transpose of a scalar multiple) 

3. $(rA)^{-1} = \frac{1}{r}A^{-1}$ (inverse of a scalar multiple) 

4. $(A + B)^T = A^T + B^T$ (transpose of a sum) 

5. $(AB)^T = B^T A^T$ (transpose of a product) 

6. $(A^{-1})^{-1} = A$ (inverse of the inverse) 

7. $(AB)^{-1} = B^{-1}A^{-1}$ (inverse of a product) 

8. $(A^T)^{-1} = (A^{-1})^T$ (inverse of the transpose) 

whenever the indicated operations are defined. 


To make the point that the claims involving inverses can be proven without reference to entries, consider theorem 
4 claim 6. In words it says the inverse of the inverse of a matrix is the matrix itself. Or in other words, if you find the 
inverse of a matrix, then find the inverse of that matrix, you get the original matrix. More in the words of the definition 
of inverse (matrices A and B are inverses if and only if AB = BA = I), any statement about inverses can be rephrased 
as a statement about a product. To show that two matrices are inverses, it is often best to show that their products, in 
both orders, each equal the identity. As for claim 6, it suffices to show $AA^{-1} = A^{-1}A = I$ (the inverse of $A^{-1}$ is the 
matrix B such that $BA^{-1} = A^{-1}B = I$), but that equality is true due to the definition of inverse. End of proof. The 
claim is more a matter of perspective than a claim of something new. 

While a list of 19 claims over 3 theorems may seem a bit overwhelming to digest, there are only a small few 
that require great attention. The claims of theorem 2 are replicas of properties of real numbers with which you 
are hopefully familiar. As such, it is the differences between the algebra of numbers and the algebra of matrices 
that should gain your focus. Primarily there is no commutative property for multiplication of matrices! This has 
consequences such as the appearance of claim 5 of theorem 3. The right distributive property is not necessary to 
enumerate separately for real numbers because it follows from the combination of commutativity for multiplication 
and distributivity of real numbers. The rest of theorem 3 is documentation of what you would probably expect to be 
true about scalar multiplication. 

In addition to the right distributive property, theorem 4 is worth careful scrutiny. It contains facts about the 
relatively new concepts of inverses and transposes. In particular, notice that (claim 5) the transpose of a product is 
the product of the transposes in the opposite order! Similarly, (claim 7) the inverse of a product is the product of the 
inverses in the opposite order. Justification for this claim is requested in exercise 17. 
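
The order reversal in claims 5 and 7 is easy to test on a small example. The following sketch uses exact fractions so that no rounding muddies the comparison. The 2 × 2 inverse formula coded in inverse_2x2 is the standard one; the helper names and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix via (1/det) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1], [5, 3]]
B = [[1, 2], [3, 4]]
AB = mat_mul(A, B)

# Claim 5: (AB)^T = B^T A^T
transpose_reverses = transpose(AB) == mat_mul(transpose(B), transpose(A))

# Claim 7: (AB)^{-1} = B^{-1} A^{-1}
inverse_reverses = inverse_2x2(AB) == mat_mul(inverse_2x2(B), inverse_2x2(A))

# Keeping the original order fails for this pair of matrices.
same_order_works = inverse_2x2(AB) == mat_mul(inverse_2x2(A), inverse_2x2(B))

print(transpose_reverses, inverse_reverses, same_order_works)
```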


Applications to eigenpairs 


As an example of the utility of the properties of theorems 2 through 4, the claim “if v is an eigenvector of A associated 
with value λ, so is cv” of section 1.7, page 43, can now be justified. Assuming v is an eigenvector of A, we know that 
Av = λv (by definition). To justify the claim, we need to demonstrate that A(cv) = λ(cv): 

A(cv) = c(Av)    theorem 3 claim 4 
      = c(λv)    definition of eigenpair 
      = (cλ)v    theorem 3 claim 3 
      = (λc)v    commutative property for multiplication of real numbers 
      = λ(cv)    theorem 3 claim 3 


Each algebraic manipulation must be supported by one of the theorems or a property of the real numbers. 
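
To see the scaling claim in action, here is a small sketch with one concrete eigenpair. The matrix, the eigenpair (3, [1, 1]), and the scalar c are sample values picked for this illustration, and mat_vec and scale_vec are hypothetical helper names.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) with a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def scale_vec(c, v):
    """Scalar multiple cv."""
    return [c * x for x in v]

A = [[2, 1], [1, 2]]
lam, v = 3, [1, 1]          # sample eigenpair: Av = 3v

is_eigenpair = mat_vec(A, v) == scale_vec(lam, v)

c = -4
cv = scale_vec(c, v)
scaled_is_eigenvector = mat_vec(A, cv) == scale_vec(lam, cv)   # A(cv) = lam (cv)

print(is_eigenpair, scaled_is_eigenvector)
```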

As a second example of the utility of these theorems, we are also prepared to prove the claim that if (λ, v) is an 
eigenpair for the matrix M, then (M − λI)v = 0, also from section 1.7. Technically the first line of the justification 
should itself be justified in some general form before it is used. Can you supply such a proof? Answer on page 77. 

0 = Mv − Mv       the difference between a matrix and itself is a zero matrix 
  = Mv − λv       definition of eigenpair (substitution of λv for Mv) 
  = Mv − λ(Iv)    definition of identity matrix (substitution of Iv for v) 
  = Mv − (λI)v    theorem 3 claim 4 
  = (M − λI)v     theorem 3 claim 5 

Key Concepts 
algebraic properties of matrices see theorems 2, 3, and 4. 


zero matrix an m × n matrix whose entries are all zero, denoted $0_{m \times n}$ or just 0 when its size is discernible through 
context. 

Exercises 


1. Illustrate the property by example. Then explain in your own words why it is true. 

(a) (A + B) + C = A + (B + C) (theorem 2 claim 2). 
(b) A + 0 = A (theorem 2 claim 3). 
(c) r(A + B) = rA + rB (theorem 3 claim 1). 
(d) (r + s)A = rA + sA (theorem 3 claim 2). 
(e) (rs)A = r(sA) (theorem 3 claim 3). [A]-349 
(f) (rA)B = A(rB) (theorem 3 claim 4). 
(g) $(A^T)^T = A$ (theorem 4 claim 1). 
(h) $(rA)^T = rA^T$ (theorem 4 claim 2). 
(i) $(A + B)^T = A^T + B^T$ (theorem 4 claim 4). [A]-349 


2. Justify the property by showing that the i,j-entry of the lefthand side of the equation equals the i,j-entry of the 
righthand side. 

(a) A + 0 = A (theorem 2 claim 3). 
(b) A(BC) = (AB)C (theorem 2 claim 4). [S]-293 
(c) A(B + C) = AB + AC (theorem 2 claim 5). 
(d) (r + s)A = rA + sA (theorem 3 claim 2). 
(e) (rA)B = A(rB) (theorem 3 claim 4). 
(f) (B + C)A = BA + CA (theorem 3 claim 5). 
(g) $(A^T)^T = A$ (theorem 4 claim 1). [S]-293 
(h) $(A + B)^T = A^T + B^T$ (theorem 4 claim 4). 
(i) $(AB)^T = B^T A^T$ (theorem 4 claim 5). 

3. Justify the claim by a string of equalities where each equality is supported by a definition or theorem or claim 
you justify separately. 

(a) $(AB)^{-1} = B^{-1}A^{-1}$ (theorem 4 claim 7). [S]-293 
(b) $(A^T)^{-1} = (A^{-1})^T$ (theorem 4 claim 8). 


1 O /a 


teta=| |, 2 1 5 


and B=| 


(a) Compute AB’. 
(b) Without any further computation, find BA’ and ex- 


plain how you got it. 


4 7 


ttA=| 3 5 


} [A]-350 


(a) Compute A™!. 


(b) Without any further computation, find (A)! and 
explain how you got it. 


. Let A = : ; . . . Calculate 3A + 7A without 
calculating 3A or 7A. [S]-293 
. LetA = 4 / fo } Calculate 2(3A) without 


calculating 3A. 


10. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 


9 6 -10 3 
ea=|% She=| *, fame = 
-1 3 -li1 ; 
1 2 9 |: Calculate AC + BC without calcu- 
lating AC or BC. 
Let A = : Calculat 14) ithout cal 
_LetA=| (4 _s5 |: alculate 3 without calcu- 
1 
lating —A. 
ating 3 
9 3 -5 —-3 5 -6 
Let A = 1 11 3 ana =| 7 12 0 } 


Calculate (A + B)’ without calculating A + B. 


Calculate (AB)’ without calculating AB. 


0 -8].,_[ 10 -5 
@ a=| Sf, 5 fa-| 5 5 | 

9 3 ],_f[1 9 
w a=| %, io f2=| 2 = 


Calculate (AB)~! without calculating AB. 


3 2 -8 -3 
@ a=| 5 * be=| 3 a 


4 1 = 5 
w =| 5 > '2=| 1 =| 


13. Justify the claim 
(A + B)(C + D) = AC + AD + BC + BD 
using a series of equalities, each one supported by a theorem. 

14. Justify the claim by arguing that the i,j-entry on the lefthand side of the conclusion equals the i,j-entry on the 
righthand side of the conclusion. 

(a) If A = B then A − B = 0. (The conclusion is A − B = 0. Use the assumption that A = B in 
your argument.) 

(b) If A − B = 0 then A = B. (The conclusion is A = B. Use the assumption that A − B = 0 in your 
argument.) 

15. Let A be an m × n matrix. Justify the claim by arguing that the i,j-entry on the lefthand side equals the i,j-entry 
on the righthand side. 

(a) $A\,0_{n \times \ell} = 0_{m \times \ell}$ for any positive integer ℓ. 

(b) $0_{\ell \times m}\,A = 0_{\ell \times n}$ for any positive integer ℓ. 

16. Show that, for any matrix A, −A is the additive inverse of A. That is, show 

(a) −A + A = 0; and 
(b) A + (−A) = 0. [A]-350 

−A has the common meaning −1 · A. 

17. Justify theorem 4 part 7. [A]-350 




Answers 


difference of a matrix with itself The general statement that if A is any matrix then A − A = 0 can be proven by 
noting that 
$(A - A)_{i,j} = A_{i,j} - A_{i,j} = 0$ 

for all entries $(A - A)_{i,j}$. 

3.2 Matrix Equations 
The algebraic equation 
x − 7 = 5 
is commonly solved by adding 7 to both sides of the equation. This works because −7 + 7 = 0 (7 is 
the additive inverse of −7) and x + 0 = x (0 is the additive identity). In linear algebra there is an additive identity 
matrix (theorem 2 claim 3) and there are additive inverses (section 3.1 exercise 16), so analogous equations ought to 
be solvable similarly. Indeed they are! The matrix equation 


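\[
X - \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ 2 & 4 \end{bmatrix} \tag{3.2.1}
\]

can be solved by the same process. Adding $\begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix}$ to both sides of the equation gives 

\[
\begin{aligned}
X - \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} + \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix}
&= \begin{bmatrix} 5 & -3 \\ 2 & 4 \end{bmatrix} + \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} \\
X + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
&= \begin{bmatrix} 5 + 7 & -3 + 1 \\ 2 - 3 & 4 + 8 \end{bmatrix} \\
X &= \begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix}
\end{aligned}
\]

Substituting $\begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix}$ for X in equation (3.2.1) yields a true statement, so $\begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix}$ is a solution. 

The slightly more advanced equation 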


8x-7=5 


is commonly solved by first adding 7 to both sides and then dividing both sides by 8 (or equivalently multiplying both 
sides by 1/8, the multiplicative inverse of 8). This might be demonstrated as follows. 

8x − 7 = 5 
8x − 7 + 7 = 5 + 7 
8x = 12 
8x/8 = 12/8 
x = 3/2 

This method works because, in addition to the facts that −7 and 7 are additive inverses and 0 is the additive identity, 
(1/8) · 8 = 1 (8 and 1/8 are multiplicative inverses) and 1x = x (1 is the multiplicative identity). In linear algebra, there are 
multiplicative identity matrices and there are multiplicative inverses (section 1.6), so analogous equations ought to be 


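solvable similarly. Indeed they are! The matrix equation 

\[
\begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix} X - \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ 2 & 4 \end{bmatrix}
\]

can be solved by the same process. Adding $\begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix}$ to both sides of the equation and then (left) multiplying both 
sides by $\begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix}^{-1}$ gives 

\[
\begin{aligned}
\begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix} X - \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} + \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix}
&= \begin{bmatrix} 5 & -3 \\ 2 & 4 \end{bmatrix} + \begin{bmatrix} 7 & 1 \\ -3 & 8 \end{bmatrix} \\
\begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix} X
&= \begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix} \\
\begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix} X
&= \begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix} \\
\begin{bmatrix} 1 & -2 \\ -2 & \frac{9}{2} \end{bmatrix} \begin{bmatrix} 9 & 4 \\ 4 & 2 \end{bmatrix} X
&= \begin{bmatrix} 1 & -2 \\ -2 & \frac{9}{2} \end{bmatrix} \begin{bmatrix} 12 & -2 \\ -1 & 12 \end{bmatrix} \\
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} X
&= \begin{bmatrix} 14 & -26 \\ -\frac{57}{2} & 58 \end{bmatrix} \\
X &= \begin{bmatrix} 14 & -26 \\ -\frac{57}{2} & 58 \end{bmatrix}
\end{aligned}
\]

Substituting $\begin{bmatrix} 14 & -26 \\ -\frac{57}{2} & 58 \end{bmatrix}$ for X in the original equation yields a true statement, so $\begin{bmatrix} 14 & -26 \\ -\frac{57}{2} & 58 \end{bmatrix}$ is a solution. 

It seems as though the familiar ideas of adding, subtracting, multiplying, and “dividing” both sides of an equation 
are valid steps in solving matrix equations, but better not to take it for granted on the back of a pair of examples. In 
adding a matrix to both sides of an equation (or subtracting a matrix from both sides of an equation) in an attempt to 
solve it, we are using the principle that for matrices A, B, C 

if A = B then A + C = B + C (3.2.2) 

whenever the indicated operations are defined. Since the veracity of this proposition is critical to the logical validity 
of the solutions above, solid proof is warranted. 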
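
Before turning to the proof, the two worked solutions above can be double-checked by substitution. The following sketch does so with exact fractions. The names P, K, R, X1, and X2 are labels introduced here for the matrices in the two example equations, and the helpers are illustrative.

```python
from fractions import Fraction

def mat_sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_mul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

K = [[7, 1], [-3, 8]]          # matrix subtracted on the left in both equations
R = [[5, -3], [2, 4]]          # righthand side of both equations
P = [[9, 4], [4, 2]]           # coefficient matrix of the second equation

X1 = [[12, -2], [-1, 12]]                      # claimed solution of X - K = R
X2 = [[14, -26], [Fraction(-57, 2), 58]]       # claimed solution of PX - K = R

first_checks_out = mat_sub(X1, K) == R
second_checks_out = mat_sub(mat_mul(P, X2), K) == R

print(first_checks_out, second_checks_out)
```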

The principle of equality suggested in exercise 14 of section 3.1 is useful. It says that A = B if and only if 
A — B= 0. That is, if we know that A = B, we can safely say that A — B = 0. And if we know that A — B = 0, we can 
safely say that A = B. Hopefully that sounds logical whether you have completed exercise 14 or not. 

Getting back to proposition (3.2.2), note that it begins with the assumption that A = B. By the principle of equality 
we can immediately deduce that 0 = A — B. Now because C — C = 0 for any matrix C and 0 is the additive identity, 
we can proceed as follows. 


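0 = A − B 
  = (A − B) + 0 
  = (A − B) + (C − C) 
  = ((A − B) + C) − C 
  = (A + (−B + C)) − C 
  = (A + (C − B)) − C 
  = ((A + C) − B) − C 
  = (A + C) + (−B − C) 
  = (A + C) − (B + C) 

We have used associativity and commutativity for addition of matrices as well as the distributive property. Now that 
we have 0 = (A + C) − (B + C) we conclude that A + C = B + C. 
Equally critical to the second solution above is the principle that for matrices A, B, C 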
Equally critical to the second solution above is the principle that for matrices A, B, C 


if A = B then CA = CB (3.2.3) 

whenever the indicated operations are defined. The proof is very similar to the proof of (3.2.2). Starting with A = B 
allows us to proceed with 0 = A − B, but this time we employ the fact that C0 = 0 for any matrix C (exercise 15 of 
section 3.1): 

0 = C0 
  = C(A − B) 
  = CA − CB 

and therefore CA = CB. It is also true that 

if A = B then AC = BC (3.2.4) 

whenever the indicated operations are defined. Why is this claim needed? Can you justify this claim? Answers on 
page 84. 


Symbolic equations 


The need to solve a wholly symbolic equation often arises in the study of mathematics. Principles (3.2.2), (3.2.3), 
and (3.2.4) are often used to solve such equations. Suppose XA − I = B and we are interested in solving for X, for 
example. The process is the same as we would use if all the symbols represented numbers. We isolate the X by adding 
to or multiplying both sides of the equation by appropriate matrices: 

$XA - I + I = B + I$ 
$XA = B + I$ 
$XAA^{-1} = (B + I)A^{-1}$ 
$X = (B + I)A^{-1}$ 

Notice the careful right-multiplication of both sides in the third line. It is not valid to left-multiply one side of the 
equation while right-multiplying the other. Also, we should note that this solution is only good as long as A is 
invertible! 
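
The warning about multiplying on the correct side can be made concrete. In the sketch below, A, its inverse A_inv, and B are sample matrices chosen so that all entries stay integers (assumptions for illustration). The candidate X = (B + I)A^{-1} satisfies XA − I = B, while the tempting but invalid Y = A^{-1}(B + I) does not.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

I = [[1, 0], [0, 1]]
A = [[2, 1], [5, 3]]
A_inv = [[3, -1], [-5, 2]]      # inverse of A, confirmed by the products below
B = [[1, 2], [3, 4]]

inverse_confirmed = mat_mul(A, A_inv) == I and mat_mul(A_inv, A) == I

X = mat_mul(mat_add(B, I), A_inv)     # right-multiply by the inverse, as in the text
Y = mat_mul(A_inv, mat_add(B, I))     # left-multiply instead (not a valid step here)

x_solves = mat_sub(mat_mul(X, A), I) == B
y_solves = mat_sub(mat_mul(Y, A), I) == B

print(inverse_confirmed, x_solves, y_solves)
```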


The most important equation in linear algebra 
For essentially the entire rest of this textbook, we will be concerned with solving equations of the form 
Mv = b (3.2.5) 

for v. Symbolically the solution is straightforward when M is invertible. Using associativity of multiplication, 
principle (3.2.3), and the definitions of inverse and identity matrices: 

$M^{-1}(Mv) = M^{-1}b$ 
$(M^{-1}M)v = M^{-1}b$ 
$Iv = M^{-1}b$ 
$v = M^{-1}b$ (3.2.6) 

The symbolic solution is straightforward, but understanding it and its ramifications is not. Plus, what if M is not invertible? 
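
For an invertible M, formula (3.2.6) translates directly into a computation. The sketch below uses a sample matrix M whose inverse has integer entries, along with a sample b; these values and the helper name mat_vec are assumptions for illustration. It computes v from the inverse and then substitutes back into Mv = b.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) with a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

M = [[2, 1], [5, 3]]
M_inv = [[3, -1], [-5, 2]]     # inverse of M (det M = 1)
b = [1, 2]

v = mat_vec(M_inv, b)          # v = M^{-1} b, as in (3.2.6)
solves_equation = mat_vec(M, v) == b

print(v, solves_equation)
```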

On page 51 it was discovered that the product of a matrix A left-multiplied by an elementary matrix E could be 
calculated one row at a time by noting that each row of EA is the linear combination of the rows of A with coefficients 
from the corresponding row of E. In symbols, row r of EA can be computed as 
$(EA)_{r,:} = E_{r,1}A_{1,:} + E_{r,2}A_{2,:} + \cdots + E_{r,n}A_{n,:}$. But this computation holds for any matrix product. Given any matrices B and A where BA is defined (that 
is, B has the same number of columns as A has rows, say n), row r of BA can be computed as 

\[
(BA)_{r,:} = B_{r,1}A_{1,:} + B_{r,2}A_{2,:} + \cdots + B_{r,n}A_{n,:} \tag{3.2.7}
\]

This fact is helpful in understanding matrix products such as the one in (3.2.5). For example, suppose the third row 
of M is twice the first row of M. Then the third row of b must be twice the first row of b. From (3.2.7) 

\[
\begin{aligned}
b_{1,:} &= M_{1,1}v_{1,:} + M_{1,2}v_{2,:} + \cdots + M_{1,n}v_{n,:} \\
b_{3,:} &= M_{3,1}v_{1,:} + M_{3,2}v_{2,:} + \cdots + M_{3,n}v_{n,:}
\end{aligned}
\]

but $M_{3,:} = 2M_{1,:}$, which means $M_{3,1} = 2M_{1,1}, M_{3,2} = 2M_{1,2}, \ldots, M_{3,n} = 2M_{1,n}$, so 

\[
\begin{aligned}
b_{3,:} &= 2M_{1,1}v_{1,:} + 2M_{1,2}v_{2,:} + \cdots + 2M_{1,n}v_{n,:} \\
&= 2(M_{1,1}v_{1,:} + M_{1,2}v_{2,:} + \cdots + M_{1,n}v_{n,:}) \\
&= 2b_{1,:}
\end{aligned}
\]

As a consequence, for such a matrix M, equation (3.2.5) can only have solutions if the third row of b happens to be 
twice the first row of b. There are choices of b for which the equation has no solution! 
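
Formula (3.2.7) and its consequence can be tried out directly. The sketch below computes a row of a product as a linear combination of rows and then, for a sample M whose third row is twice its first, confirms that the third entry of Mv is twice the first for several choices of v. The matrices, the test vectors, and the helper names are assumptions for illustration.

```python
def mat_mul(B, A):
    """Matrix product BA from the entrywise definition."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
            for i in range(len(B))]

def row_of_product(B, A, r):
    """Row r of BA as a linear combination of the rows of A, following (3.2.7)."""
    row = [0] * len(A[0])
    for k, coefficient in enumerate(B[r]):
        row = [total + coefficient * entry for total, entry in zip(row, A[k])]
    return row

M = [[1, 2, 3],
     [0, 1, 4],
     [2, 4, 6]]                 # third row is twice the first

# The row-combination view agrees with the entrywise definition.
V = [[1, 0], [2, -1], [0, 5]]
rows_agree = all(row_of_product(M, V, r) == mat_mul(M, V)[r] for r in range(3))

# For any column vector v, the third entry of b = Mv is twice the first.
samples = [[[1], [0], [0]], [[2], [-3], [7]], [[-1], [4], [5]]]
third_is_twice_first = all(
    mat_mul(M, v)[2][0] == 2 * mat_mul(M, v)[0][0] for v in samples
)

print(rows_agree, third_is_twice_first)
```

So a righthand side such as b = (1, 0, 0), whose third entry is not twice its first, can never be produced by this M.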

Exercises 9 through 11 of section 1.5 provide evidence that any matrix M in which some row is a linear combination 
of the others (this is certainly the case for a matrix where the third row is twice the first) will have determinant 
zero. Formula (1.6.2) suggests that matrices with zero determinant are not invertible. Stringing all this together, it 
seems the following concepts are interconnected. 

• Some row of M is a linear combination of the others. 

• M has determinant zero. 

• M is not invertible. 

• The equation Mv = b has no solution for certain choices of b. 


We are not quite ready to prove the connection, but the pieces of the puzzle are falling into place. Taking a slightly 
different perspective on matrix multiplication will add one more item to the list. 

Just as we can imagine the product of two matrices as a collection of linear combinations of the rows of the 
righthand matrix, we can also imagine the product as a collection of linear combinations of the columns of the 
lefthand matrix. Thinking generally again, suppose we are given an arbitrary pair of matrices B and A where BA 
is defined (that is, B has the same number of columns as A has rows, say n). By definition the i,c-entry of BA is 
$B_{i,1}A_{1,c} + B_{i,2}A_{2,c} + \cdots + B_{i,n}A_{n,c}$. Swapping the order of each product, $(BA)_{i,c} = A_{1,c}B_{i,1} + A_{2,c}B_{i,2} + \cdots + A_{n,c}B_{i,n}$. In 
particular, if B has m rows, 

\[
\begin{aligned}
(BA)_{1,c} &= A_{1,c}B_{1,1} + A_{2,c}B_{1,2} + \cdots + A_{n,c}B_{1,n} \\
(BA)_{2,c} &= A_{1,c}B_{2,1} + A_{2,c}B_{2,2} + \cdots + A_{n,c}B_{2,n} \\
&\;\;\vdots \\
(BA)_{m,c} &= A_{1,c}B_{m,1} + A_{2,c}B_{m,2} + \cdots + A_{n,c}B_{m,n}
\end{aligned}
\tag{3.2.8}
\]

Reading these equations together (as columns of numbers) from left to right, they imply that column c of BA (the 
lefthand sides of the equations) equals $A_{1,c}$ times column 1 of B (the first terms of the righthand sides) plus $A_{2,c}$ times 
column 2 of B (the second terms of the righthand sides) plus $A_{3,c}$ times column 3 of B (the third terms of the righthand 
sides), and so on. In symbols, 

\[
\begin{bmatrix} (BA)_{1,c} \\ (BA)_{2,c} \\ \vdots \\ (BA)_{m,c} \end{bmatrix}
= A_{1,c} \begin{bmatrix} B_{1,1} \\ B_{2,1} \\ \vdots \\ B_{m,1} \end{bmatrix}
+ A_{2,c} \begin{bmatrix} B_{1,2} \\ B_{2,2} \\ \vdots \\ B_{m,2} \end{bmatrix}
+ \cdots
+ A_{n,c} \begin{bmatrix} B_{1,n} \\ B_{2,n} \\ \vdots \\ B_{m,n} \end{bmatrix}
\]

In other words, column c of BA is a linear combination of the columns of B where the coefficients for the linear 
combination come from column c of A. In short, 

\[
(BA)_{:,c} = A_{1,c}B_{:,1} + A_{2,c}B_{:,2} + \cdots + A_{n,c}B_{:,n}. \tag{3.2.9}
\]

In the special case of equation (3.2.5), 

\[
(Mv)_{:,c} = v_{1,c}M_{:,1} + v_{2,c}M_{:,2} + \cdots + v_{n,c}M_{:,n}
\]

but v is a vector (column matrix) so it has only one column. Accordingly, Mv has only one column and we can write 

\[
Mv = v_{1,1}M_{:,1} + v_{2,1}M_{:,2} + \cdots + v_{n,1}M_{:,n}. \tag{3.2.10}
\]




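Revisiting this idea with columns in place of rows, suppose the third column of M is twice the first, so that 
$M_{:,3} = 2M_{:,1}$. By (3.2.10) this means 

\[
\begin{aligned}
b = Mv &= v_{1,1}M_{:,1} + v_{2,1}M_{:,2} + v_{3,1}(2M_{:,1}) + \cdots + v_{n,1}M_{:,n} \\
&= (v_{1,1} + 2v_{3,1})M_{:,1} + v_{2,1}M_{:,2} + v_{4,1}M_{:,4} + \cdots + v_{n,1}M_{:,n}
\end{aligned}
\]

Letting $v_{1,1} = -2$, $v_{3,1} = 1$ and $v_{j,1} = 0$ for all j not equal to 1 or 3, it turns out b = 0. In other words, the associated 
homogeneous equation Mv = 0 has a solution where v ≠ 0, a nontrivial solution. Finally, we add 

• Mv = 0 has a nontrivial solution. 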


to the list of related concepts. But wait, there’s more! 
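
The column view in (3.2.10) and the nontrivial solution it produces can be verified on a small case. The displayed computation doubles a column of M, so the sketch below assumes a sample M whose third column is twice its first (an illustrative choice, as are the helper names). It checks that Mv computed entry by entry matches the linear combination of columns, and that v = (−2, 0, 1) is sent to the zero vector.

```python
def mat_vec(M, v):
    """Mv from the entrywise definition."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def column_combination(M, v):
    """Mv as v_1 * (column 1 of M) + ... + v_n * (column n of M), following (3.2.10)."""
    result = [0] * len(M)
    for k, coefficient in enumerate(v):
        column = [row[k] for row in M]
        result = [total + coefficient * entry for total, entry in zip(result, column)]
    return result

M = [[1, 2, 2],
     [0, 1, 0],
     [3, 4, 6]]                 # third column is twice the first

w = [5, -1, 2]
views_agree = mat_vec(M, w) == column_combination(M, w)

v = [-2, 0, 1]                  # nonzero vector suggested by the computation above
image_of_v = mat_vec(M, v)

print(views_agree, image_of_v)
```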


Equation (3.2.8) has the form of a system of linear equations with variables $A_{1,c}, A_{2,c}, \ldots, A_{n,c}$. Can you show 
that the equation Mv = b is equivalent to a linear system of equations whose augmented matrix is [ M b ] and 
variables are $v_{1,1}, v_{2,1}, \ldots, v_{n,1}$? Answer on page 84. That makes the last item in our list of related concepts 

• The linear system represented by the augmented matrix [ M b ] has no solution for certain choices of b. 

Wow. Apparently there are six ways of understanding the same phenomenon. 


Key Concepts 
addition property of equality for all matrices A, B, C if A = B then A + C = B + C whenever the sums are defined. 

left multiplication property of equality for all matrices A, B, C if A = B then CA = CB whenever the products are 
defined. 

matrix form of a linear system if v is a (variable) vector with n entries, the matrix equation Mv = b is equivalent 
to the linear system with augmented matrix [ M b ] and variables $v_{1,1}, v_{2,1}, \ldots, v_{n,1}$. 

matrix product as a linear combination of rows given matrices A and B, if BA is defined then row r of BA can be 
computed as a linear combination of the rows of A using row r of B as coefficients: 

\[
(BA)_{r,:} = B_{r,1}A_{1,:} + B_{r,2}A_{2,:} + \cdots + B_{r,n}A_{n,:}.
\]

matrix product as a linear combination of columns given matrices A and B, if BA is defined then column c of BA 
can be computed as a linear combination of the columns of B using column c of A as coefficients: 

\[
(BA)_{:,c} = A_{1,c}B_{:,1} + A_{2,c}B_{:,2} + \cdots + A_{n,c}B_{:,n}.
\]

In the special case of a matrix M times a vector $v = [\, v_1 \; v_2 \; \cdots \; v_n \,]^T$: 

\[
Mv = v_1M_{:,1} + v_2M_{:,2} + \cdots + v_nM_{:,n}.
\]

nontrivial solution a solution v ≠ 0 of the equation Mv = 0. 

right multiplication property of equality for all matrices A, B, C if A = B then AC = BC whenever the products 
are defined. 
Exercises 0 -19} y_]| -2 19 
©) + X=] 14 99 | [S}294 
1. Solve 

(d) 5X + 1 0 |-| 3 =] 

@ x+| 73 a ee | Boi 2 = -5 

a = 

19 6 16 9 2. Solve 
=4 37 _f 12 -20 1 0 -18 2 
©) -14 3 +x-| -16 0 | eee (a) 2 i x-| 2 <8 | 
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5 2 13. -13 2 2 1 =) 
w) | 63 |x-| -19 7 | (sr294 a ee 2 | a 
als. a 4 23] 2 > 
7 2 -13. 18 |_| -20 14 =) 5 
© | AP |-ell “10 |-| i542 | a eee | 
350 4 -1 -1 -3 
-13 1 -5 9 -16 4 0 4 -3 
@ | 19 4 |-| <3 & |x-| 1 -6 | | a | {+45 | 
3. Solve for the specified variable. Assume all indicated op- ou =e 
erations are defined. Make a note whenever you assume -4 5 -3 -4 =3. 3 
a matrix is invertible. (c) 0 3 3 2 | . = [$]-294 
(a) XYZ = Bfor Y 4 1 4 -5 0 1 
(b) XYZ = B for Z [A]-350 
302 3 0 4 -3 
Sr Aer ats Cied (@) | a Aon | 1-4 5 | [A}-350 
(d) PDP"! =A for D [S]-294 3 20 1 


(e) 2A(B"! + C’) = D for C [A]-350 
(f) BC)’ +2B! =A for B 
4. Write the matrix equation as an equivalent linear system. 


6 2 19 
Ores -14 1 -10 | 


[x om x |se=[ 7 ali 


cA = | 


ak = 37 
(b) Ax = 0; A = | -118 9 109 |; x = 
o6. 3 93 
T 
[xy ef) 


(c) Tr; = %; T -| 


-8 
| 9 | (44350 


b; M = 


oly 
a 

anF 
| | 
Bare 
NB 
eee 
< 

I 

——S= 
=< = 
ys 
— 


5. Specify the matrix M, vector v, and vector b so that the 
matrix equation Mv = b is equivalent to the linear sys- 


tem. 
-12x - o6y - 7 = 16 
(a) - Sy + 18 = 2 
-15x + l0y + 8 = -I1 
-15x, + 4x = -14 
(b) —4x, = 17x = 7x3 = 3 
ic) 15r - lls = O 
 3r + 10s = 0 
14y,, - I7y = -Il 
(d) 2; = -4  [A]-350 
ov, = 2V2 = -8 


6. Compute the second row of the product (if it exists) with- 
out computing the rest of the product. 


7. Compute the third row of the product in question 6 (if it exists) by summing an appropriate linear combination of 
row vectors. [S]-294 [A]-350 

8. Compute the second column of the product in question 6 (if it exists) without computing the first. [S]-295 [A]-350 

9. Compute the third column of the product in question 6 (if it exists) by summing an appropriate linear combination 
of column vectors. [S]-295 [A]-350 

10. Suppose the third column of A contains all zeros. What can you say about the third column of BA? Why? Assume 
BA is defined. 

11. Suppose the second row of B contains all ones. What can you say about the second row of BA? Why? Assume BA 
is defined. 

12. Suppose the fifth column of A is three times the second column of A. What can you say about the fifth column 
of BA? Why? Assume BA is defined. 

13. Demonstrate that the zero product rule, which holds for real numbers, does not hold for matrices. That is, show 
that the claim “if A and B are matrices such that AB = 0, then A = 0 or B = 0” is false. Do so (four times over) by 
providing examples of matrices A and B such that A ≠ 0 and B ≠ 0 yet AB = 0 in each of the following cases. 

(a) A and B are nonsquare matrices. 
(b) A is square but B is not square. 
(c) B is square but A is not square. 
(d) A and B are both square. 

14. Argue that if AB = 0 then one of the following must be true. 

• det A = 0 
• det B = 0 
• A is not square 
• B is not square 

Use the fact that for a square matrix M, det M = 0 if and only if M is noninvertible. 

15. Show that the converse of (3.2.2) is true. That is, justify the claim that for all matrices A, B, C, if A + C = B + C 
then A = B whenever the sums are defined. 

16. Show that the converse of (3.2.3) is false by supplying matrices A, B, C such that CA = CB but A ≠ B. 


Answers 


multiplication property of equality part 2 Claims (3.2.3) and (3.2.4) are distinct, and therefore both needed, because 
matrix multiplication is not commutative. Claim (3.2.4) can be proven as follows. Since A = B, 0 = A − B. 
Hence, if A = B then 

0 = 0C 
  = (A − B)C 
  = AC − BC 

and therefore AC = BC. 


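Mv = b as a linear system  Let M be an m × n matrix and suppose Mv = b (making v a vector with n entries and b a vector with m entries). Then

\[
Mv =
\begin{bmatrix}
M_{1,1} & M_{1,2} & \cdots & M_{1,n} \\
M_{2,1} & M_{2,2} & \cdots & M_{2,n} \\
\vdots & \vdots & & \vdots \\
M_{m,1} & M_{m,2} & \cdots & M_{m,n}
\end{bmatrix}
\begin{bmatrix} v_{1,1} \\ v_{2,1} \\ \vdots \\ v_{n,1} \end{bmatrix}
=
\begin{bmatrix}
M_{1,1}v_{1,1} + M_{1,2}v_{2,1} + \cdots + M_{1,n}v_{n,1} \\
M_{2,1}v_{1,1} + M_{2,2}v_{2,1} + \cdots + M_{2,n}v_{n,1} \\
\vdots \\
M_{m,1}v_{1,1} + M_{m,2}v_{2,1} + \cdots + M_{m,n}v_{n,1}
\end{bmatrix}.
\]

Setting this vector equal to b gives

\[
\begin{bmatrix}
M_{1,1}v_{1,1} + M_{1,2}v_{2,1} + \cdots + M_{1,n}v_{n,1} \\
M_{2,1}v_{1,1} + M_{2,2}v_{2,1} + \cdots + M_{2,n}v_{n,1} \\
\vdots \\
M_{m,1}v_{1,1} + M_{m,2}v_{2,1} + \cdots + M_{m,n}v_{n,1}
\end{bmatrix}
=
\begin{bmatrix} b_{1,1} \\ b_{2,1} \\ \vdots \\ b_{m,1} \end{bmatrix}
\]

which can only be true if corresponding entries are equal. In other words,

\[
\begin{aligned}
M_{1,1}v_{1,1} + M_{1,2}v_{2,1} + \cdots + M_{1,n}v_{n,1} &= b_{1,1} \\
M_{2,1}v_{1,1} + M_{2,2}v_{2,1} + \cdots + M_{2,n}v_{n,1} &= b_{2,1} \\
&\;\;\vdots \\
M_{m,1}v_{1,1} + M_{m,2}v_{2,1} + \cdots + M_{m,n}v_{n,1} &= b_{m,1}
\end{aligned}
\]

Therefore, the equation Mv = b is equivalent to the system with augmented matrix [ M  b ] and variables $v_{1,1}, v_{2,1}, \ldots, v_{n,1}$.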
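To see the equivalence in action, here is a small Python sketch. The function names `mat_vec` and `is_solution` and the 3 × 2 example are illustrative assumptions, not anything from the text. Each entry of Mv is computed as the left-hand side of one equation of the system, so checking Mv = b is the same as checking every equation.

```python
def mat_vec(M, v):
    """Entry i of Mv is M[i][0]*v[0] + ... + M[i][n-1]*v[n-1]."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def is_solution(M, v, b):
    """True when v satisfies every equation of the system [M | b]."""
    return mat_vec(M, v) == list(b)


# Hypothetical 3 x 2 example (not from the text).
M = [[1, 2],
     [3, 4],
     [5, 6]]
v = [1, -1]
b = mat_vec(M, v)               # right-hand side built so that v is a solution
print(b, is_solution(M, v, b))  # [-1, -1, -1] True
```

Changing a single entry of v breaks at least one of the three equations here, and `is_solution` then reports False.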
3.3 Linear Independence


The matrices 


3 3 -5 
A=| 5 5 |-28-| 4 2 7 | e= -8 2 0 |, 
5 -4 7 
-11 10 -5 3 7 =I 
pal 8 =) 6 6 Oo 4 


—2 8 7 -9 -15 -3 
8 5 -10 -7 5 2 


have something in common. Each matrix has a column that can be written as a linear combination of the other 
columns in the matrix. Not all matrices have this property, and there is an important distinction between those that do 
and those that do not. 


-8 


Compare E = ; 5 to A, for example. In F, neither column is a multiple of the other so neither column can 


be written as a linear combination of the other. In A, the second column is −2 times the first:

• $A_{:,2} = -2A_{:,1}$


You may be struggling a little bit to see this as a linear combination, but nothing in the definition of linear combination 
requires more than one term. So if an object is a multiple of another it is a linear combination of it. 

Notice that det A = 0 while det E ≠ 0. The matrix with one column that can be written as a linear combination of the others has 0 determinant while the matrix whose columns cannot be written as linear combinations of the others has nonzero determinant. We made a similar observation about linear combinations of the rows of a matrix and its determinant in section 1.5.

For matrices B, C, D it is less clear that one column is a linear combination of the others, but you can check that 


• $B_{:,3} = 2B_{:,1} - B_{:,2}$
• $C_{:,1} = -4C_{:,2} + 3C_{:,3}$
• $D_{:,5} = 2D_{:,1} + 2D_{:,2} + 0D_{:,3} + 3D_{:,4} + 0D_{:,6}$


Don’t be misled by the suggestion that “one column” is a linear combination of the others, however. It is true, but does not tell the whole story. In no case is the column written in terms of the others special. For matrix A, for example, we could have easily pointed out that the first column is −1/2 times the second. The first column is a linear combination of the second, and the second column is a linear combination of the first. Neither one should take precedence.

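A little algebra will show that all the following equations are also true.

• $B_{:,2} = 2B_{:,1} - B_{:,3}$
• $B_{:,1} = \frac{1}{2}B_{:,2} + \frac{1}{2}B_{:,3}$
• $C_{:,3} = \frac{1}{3}C_{:,1} + \frac{4}{3}C_{:,2}$
• $C_{:,2} = -\frac{1}{4}C_{:,1} + \frac{3}{4}C_{:,3}$
• $D_{:,1} = -D_{:,2} - \frac{3}{2}D_{:,4} + \frac{1}{2}D_{:,5}$
• $D_{:,2} = -D_{:,1} - \frac{3}{2}D_{:,4} + \frac{1}{2}D_{:,5}$
• $D_{:,4} = -\frac{2}{3}D_{:,1} - \frac{2}{3}D_{:,2} + \frac{1}{3}D_{:,5}$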


The entire set of columns involved (with nonzero coefficient) in the linear combination is special. Any such column 
can be written as a linear combination of the others. 

To emphasize that the set of columns is special, not that one particular column within the matrix is special, each of the equations above can be rearranged so one side of the equation becomes 0. As a result, instead of having the 11 equations above, where in each case one column is spotlighted as the “special” column being written in terms of the others, we have
• $2A_{:,1} + A_{:,2} = 0$
• $2B_{:,1} - B_{:,2} - B_{:,3} = 0$
• $C_{:,1} + 4C_{:,2} - 3C_{:,3} = 0$
• $2D_{:,1} + 2D_{:,2} + 0D_{:,3} + 3D_{:,4} - D_{:,5} + 0D_{:,6} = 0$


The fact that columns within the matrix can be written as linear combinations of the others is captured, but no particular column is prominent, motivating the following definition.

Let S be a set of objects on which addition and scalar multiplication are defined and which contains an additive identity, called 0. For scalars $x_1, x_2, \ldots, x_p$ and objects $b_1, b_2, \ldots, b_p$ of S, we say that $b_1, b_2, \ldots, b_p$ are linearly dependent or that $\{b_1, b_2, \ldots, b_p\}$ is a linearly dependent set if there is a solution of

\[ x_1b_1 + x_2b_2 + \cdots + x_pb_p = 0 \tag{3.3.1} \]

where not all the $x_i$ are zero. Such a solution is called a nontrivial solution. Otherwise the objects $b_1, b_2, \ldots, b_p$ are linearly independent and $\{b_1, b_2, \ldots, b_p\}$ is a linearly independent set. Note that the 0 in (3.3.1) is the additive identity, not necessarily the number 0.

In addition to being a statement about the set of objects rather than one special member of the set, this definition 
handles the case when one of the objects in the set is the 0 object (additive identity) itself. In this case, that particular 
object is special. It can be written as a linear combination of the others (with all coefficients equal to zero) but it is 
not necessarily the case that any of the other objects can be written as a linear combination of the remaining ones. To 
illustrate, suppose 


\[ E = \begin{bmatrix} 1 & -4 & 0 \\ 2 & 5 & 0 \\ 3 & -1 & 0 \end{bmatrix}. \]

The third column is a zero vector. Accordingly,

\[ E_{:,3} = 0E_{:,1} + 0E_{:,2} \]

so the third column is a linear combination of the first two. However, neither of the first two columns is a linear combination of the others. This is clear since the first two columns are not multiples of one another. In the context of the definition, we have

\[ 0E_{:,1} + 0E_{:,2} + E_{:,3} = 0, \]

a nontrivial linear combination (not all of the coefficients are zero) of the columns that sums to 0. No fanfare. No notes of special cases. The definition of linear dependence is clean and direct.
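The definition can be tested directly on a computer. The following Python sketch uses hypothetical helper names `combine` and `is_nontrivial_dependence`, with columns stored as plain lists. It checks whether a proposed list of coefficients is a nontrivial solution of (3.3.1) for the columns of the matrix E above.

```python
def combine(columns, coeffs):
    """Return coeffs[0]*columns[0] + ... + coeffs[p-1]*columns[p-1]."""
    size = len(columns[0])
    return [sum(c * col[i] for c, col in zip(coeffs, columns))
            for i in range(size)]


def is_nontrivial_dependence(columns, coeffs):
    """True when the coefficients are not all zero and the combination is 0."""
    not_all_zero = any(c != 0 for c in coeffs)
    return not_all_zero and all(x == 0 for x in combine(columns, coeffs))


# Columns of the 3 x 3 matrix E whose third column is the zero vector.
E_cols = [[1, 2, 3], [-4, 5, -1], [0, 0, 0]]
print(is_nontrivial_dependence(E_cols, [0, 0, 1]))  # True
print(is_nontrivial_dependence(E_cols, [0, 0, 0]))  # False: trivial solution
```

Note that the check only confirms a candidate set of coefficients. Finding such coefficients in general takes row reduction.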
Matrix Characterization Part 1 


Free variables, solution sets of linear systems, and pivot positions of matrices are all directly connected to the concept 
of linear dependence. 


Theorem 5. [Characterization of Matrices Part 1] Suppose M is an m × n matrix, v has n entries, and b has m
entries. Then the following are equivalent. 


(i) The columns of M are linearly independent. 
(ii) No column of M is a linear combination of the others. 
(iii) Mv = 0 has only the trivial solution. 
(iv) M has a pivot position in every column. 
(v) There is a matrix L such that LM = I. 
(vi) Mv = b has at most one solution for each b. 


(vii) Mv = b has no free variables. 
The following list of arguments will show that if one of the statements is true, so is another...and if that one is true so
is a third...and if that one is true so is the next...and so on until all the statements have been justified. Such a series of 
justifications means that if the first statement is true, they are all true since they all followed logically from the first. 
Proving they are equivalent requires one more step. The last statement will be shown to imply the first, completing 
a logical path from any one of the statements to any other. Closing the loop this way means that if any one of the 
statements is true, they are all true, the very meaning of equivalent! 


Crumpet 18: Proof by Contraposition 


Suppose you are trying to prove that if some statement, call it p, is true, then some other statement, call it q, is true. In short, you are trying to prove that if p is true then q is true. Then it is just as good to prove the contrapositive claim, if q is false then p is false, because if the contrapositive is true then it is impossible to have both q false and p true. In other words, if p is true so is q because q cannot be false at the same time p is true (and that means if p is true then q is true). By similar logic, if any statement in a list of equivalent statements is false, they are all false.
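If you would like to convince yourself that a claim and its contrapositive always agree, a truth table does the job. The Python sketch below (the helper name `implies` is an assumption made for illustration) compares "if p then q" with "if not q then not p" on all four truth assignments.

```python
from itertools import product


def implies(p, q):
    """Truth value of 'if p then q'."""
    return (not p) or q


# One row per truth assignment: p, q, the claim, and its contrapositive.
rows = [(p, q, implies(p, q), implies(not q, not p))
        for p, q in product([True, False], repeat=2)]
same = all(direct == contra for _, _, direct, contra in rows)
print(same)  # True: the two claims agree on every assignment
```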


The following arguments demonstrate that (i) ⇒ (ii), (ii) ⇒ (iii), (iii) ⇒ (iv), (iv) ⇒ (v), (v) ⇒ (vi), (vi) ⇒ (vii), (vii) ⇒ (iii), and (iii) ⇒ (i). More succinctly, (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (v) ⇒ (vi) ⇒ (vii) ⇒ (iii) ⇒ (i), and diagrammatically,

(i)             (iv) ⇒ (v)
 ⇓    ⇖      ⇗           ⇓
(ii) ⇒ (iii) ⇐ (vii) ⇐ (vi)

The diagram illustrates a logical path from any one of the statements to any other. Therefore, justifying each statement 
as claimed shows that the statements are equivalent. 


(i) ⇒ (ii) Requested in exercise 20.

(ii) ⇒ (iii) Suppose Mv = 0 has a nontrivial solution, $v = [\, x_1 \;\; x_2 \;\; \cdots \;\; x_n \,]^T$. Then $Mv = x_1M_{:,1} + x_2M_{:,2} + \cdots + x_nM_{:,n} = 0$, and since v is a nontrivial solution, one of the entries of v is nonzero, say $x_i$. Therefore, $x_iM_{:,i} = -x_1M_{:,1} - \cdots - x_{i-1}M_{:,i-1} - x_{i+1}M_{:,i+1} - \cdots - x_nM_{:,n}$, and more to the point,

\[ M_{:,i} = -\frac{x_1}{x_i}M_{:,1} - \cdots - \frac{x_{i-1}}{x_i}M_{:,i-1} - \frac{x_{i+1}}{x_i}M_{:,i+1} - \cdots - \frac{x_n}{x_i}M_{:,n} \]

making column i a linear combination of the other columns.


(iii) ⇒ (iv) Suppose M does not have a pivot position in every column. Then Mv = 0, a consistent system with solution v = 0, has free variables. By theorem 1, Mv = 0 has more than one solution.


(iv) ⇒ (v) Let R be the reduced row echelon form of M. Because M has a pivot position in every column, the first n rows of R, $R_{1:n,:}$, must be the n × n identity matrix. Because $R = E_k \cdots E_2E_1M$ for some m × m elementary matrices $E_1, E_2, \ldots, E_k$, we have R = EM where $E = E_k \cdots E_2E_1$. Hence $E_{1:n,:}M = R_{1:n,:} = I$. Let $L = E_{1:n,:}$.


(v) ⇒ (vi) Suppose LM = I and Mv = b. Then L(Mv) = Lb ⇒ (LM)v = Lb ⇒ Iv = Lb ⇒ v = Lb. Hence if Mv = b has a solution at all, that solution must be v = Lb. So Mv = b either has zero or one (in other words, at most one) solution.


(vi) ⇒ (vii) Suppose Mv = b has a free variable (for every b) and let $b = M_{:,1}$. Then $v = [\, 1 \;\; 0 \;\; 0 \;\; \cdots \;\; 0 \,]^T$ is a solution of Mv = b so Mv = b is consistent. By theorem 1 a consistent linear system with free variables has infinitely many solutions.


(vii) ⇒ (iii) Suppose Mv = b has no free variables (for any b). Then Mv = 0 has no free variables and is consistent, having v = 0 as a solution. By theorem 1, Mv = 0 has exactly one solution, the trivial solution.


(iii) ⇒ (i) Let $v = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_n \,]^T$. By assumption, $Mv = v_1M_{:,1} + v_2M_{:,2} + \cdots + v_nM_{:,n} = 0$ has only the trivial solution. By definition of linear independence, $M_{:,1}, M_{:,2}, \ldots, M_{:,n}$ (the columns of M) are linearly independent.


Later, we will see that the determinant, row equivalence, invertibility, and function concepts are also directly connected to these statements.
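Statement (iv) of theorem 5 gives a practical test for linear independence of columns: row reduce and count pivots. The Python sketch below is one way to do that with exact fractions. The function names `pivot_columns` and `columns_independent` are illustrative assumptions, and the two test matrices are the 3 × 3 matrix E with a zero third column and the matrix made from its first two columns.

```python
from fractions import Fraction


def pivot_columns(M):
    """Row reduce a copy of M and return the (0-based) pivot column indices."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(n):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue                         # no pivot in this column
        A[r], A[p] = A[p], A[r]              # swap
        piv = A[r][c]
        A[r] = [x / piv for x in A[r]]       # scale
        for i in range(m):                   # replace
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return pivots


def columns_independent(M):
    """Statement (iv) of theorem 5: a pivot position in every column."""
    return len(pivot_columns(M)) == len(M[0])


print(columns_independent([[1, -4], [2, 5], [3, -1]]))           # True
print(columns_independent([[1, -4, 0], [2, 5, 0], [3, -1, 0]]))  # False
```

Exact fractions are used instead of floating point so that "nonzero" means exactly nonzero, which is what a pivot position requires.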
Key Concepts

characterization of matrices  see theorem 5.

equivalent statements  a list of statements such that if one of them is true they are all true.

linearly dependent  objects $b_1, b_2, \ldots, b_p$ are linearly dependent whenever $\{b_1, b_2, \ldots, b_p\}$ is a linearly dependent set.

linearly independent  objects $b_1, b_2, \ldots, b_p$ are linearly independent whenever $\{b_1, b_2, \ldots, b_p\}$ is a linearly independent set.

linearly independent set  a set that is not linearly dependent.

linearly dependent set  a set for which a nontrivial linear combination of its elements equals the additive identity.

nontrivial linear combination  a linear combination in which not all the coefficients are zero.

nontrivial solution  any solution $x_1, x_2, \ldots, x_p$ of the equation $x_1b_1 + x_2b_2 + \cdots + x_pb_p = 0$ where not all the $x_i$ are zero.

trivial solution  the solution $x_1 = x_2 = \cdots = x_p = 0$ of the equation $x_1b_1 + x_2b_2 + \cdots + x_pb_p = 0$.
Exercises 2) = Ve Se Sy 
(c) Tv, + 2 = b, [S]-296 
1. Show that the vectors are linearly independent. —4y) -— vw + 3 = db; 
(a) 5 = [s]-295 x + 5y + Tz bd 
4}'| - (d) 6x + Ty + 2z = bo [$]-297 
1 a] -6x + 3y - z = by 
“(H) 5x + y = b 
= -3x + y = dD 
© oy 4 b 
(c) | 2 |, [S]-296 x + 4y = by 
-l —XxX| = b, 
3 -2 () 8% - Tm = by 
(d) | 2 I -4 6x, + 5x. = bs 
-4 3 —2y, + 4m - 63 = db 
5 5 (g) 3y + 6% + 73 = by 
see Gi = Bi + Om = by 
-1 2 -2 x + Ty + 2 = db 
=9 5 0 8x + 8y + 62 = by 
(h) 4 6 Tz = b 
(f) 5 > —4 3 —4X + yor ZZ = 3 
1 1 1 -Ix + 2y - 5z = bh 
1 -4 =3 3. Show that the homogeneous system has only one solu- 
0 =9 =) tion, the trivial solution. 
(2) | 1 |] 4 3 i 4 
-2 -1 3 (@) } 4 _5 |v =9 [S}-297 
ol 5 0 6 5 |i 0 
of 2h a] 2 OLS alle elo 
5 |} -3 |] 5 


-4 -1 
: () 


0 

[a l[2 

2. Show that the linear system has at most one solution for 2 0 
any values by, b, b3, by. 


5 1 
4x + lly = By o| 3 3 js 


(a) 


5x + I2y = dy 2 -2 

—X| oF X2 = b, 5 6 1 0 
(b) -8x1 + Tm = by (e) 6 -7 -—2 |x=]| 0 

5x] + x2 = b -l1 -4 -l 0 
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=4 =2- <1 |f 4 
of 3 3 « |] |= 
2 f 45 | xy 


& 2 tp 
5 0 1 1 

(g) 1-4 4 | V2 |- 0 [S]-297 
5 7 4 {bt %3 
0 1 4 0 
=A 3. 5 0 

ee ee eS es (er 
= «1 § 0 


4. Show that the functions are linearly independent. 
(a) $1 + t$, $t + t^2$, and $1 + t^2$
(b) $\sin^2 t$ and $\cos^2 t$ [S]-297
(c) e* and e?* 

5. Show that the functions are linearly dependent.
(a) $1 + 3t - 2t^2$, $-9 - 23t + 21t^2$, and $1 + 7t + t^2$
(b) $1$, $\sin^2 t$, and $\cos^2 t$
(c) $\sin^2 t$, $\cos^2 t$, and $\cos(2t)$ [S]-298


6. Do the columns of the matrix form a linearly independent 
set? 


2 2 
@ |, | 
3: 9 
(b) | 11 
6 7 
“8 <5 <7 
© | 5-9 al 
=1 +8 =10 
of -8 6 | 
9 +4 40 


1 -1 5 4 
02 ait. 2 | 


2 —-2 -9 
0 -ll -8 
9 6 -5 
1 -7 8 


-24 4 -93 -68 21 15 [S}- 
70 -89 78 26 -78 0O 
-46 68 -87 -88 -39 67 


-54 -30 -96 9 6 -74 
(g) 
298 


7. Determine whether the set is linearly independent. 


{-10r +P —5P,2t+ 67 — 2P, 


(a) 311+ 1485} 
‘ss {{ 1 8 —-ll Me, oh sie 


ieee ere real 


8. Is the empty set linearly independent or linearly dependent? [A]-350

9. A 9 × 6 matrix has linearly independent columns. How many pivot positions does it have?

10. A 7 × 6 matrix has 5 pivot positions. What can you say about the linear independence of its columns?

11. Give an example of a 3 × 2 matrix M such that Mv = 0
(a) has only the trivial solution
(b) has a nontrivial solution

12. What are the possible reduced row echelon forms of a 4 × 3 matrix with [A]-350
(a) linearly independent columns?
(b) linearly dependent columns?

13. What are the possible row echelon forms of a 2 × 2 matrix with
(a) linearly independent columns?
(b) linearly dependent columns?

14. If M is an m × n matrix with linearly independent columns, what can you say about the relationship between m and n? HINT: Can a matrix with linearly independent columns have more columns than rows?


15. Find the value(s) of x for which the matrix has linearly 
independent columns. 


2 -6 
(a) | 3 x | 
4 
(b) 0 7 | [A]-350 


1 8 0 
ol « x 1 


1 8 2 

(d) 6 45 1 

—-3 -20 x 

16. Find the value(s) of x for which the matrix has linearly 
dependent columns. 


[A]-350 


3 x 
@| 3 4 
(b) | a | [A]-350 
x 5 
x -6 27 
(c) | —2 8 -30 | 
-l 5 -18 
1 5 -4 
(d) | -5 3 x [A]-350 
-7 -ll 4 


17. Find a nontrivial solution of Mv = 0 using the fact that the first and second columns of M are identical. Do not use row reduction.

\[ M = \begin{bmatrix} 94 & 94 & 85 \\ -97 & -97 & 83 \\ 6 & 6 & 24 \\ 5 & 5 & -77 \end{bmatrix} \]
18. Argue that the statements are equivalent. [S]-298
(a) x=8 
(b) x is a perfect cube between 6 and 20. 

19. Argue that the statements are equivalent. 


(a) The graph of f is a line with slope 3 and y-intercept −5.

(b) f(x) = 3x − 5.

(c) f is a first degree polynomial passing through (−10, −35) and (10, 25).


HINT: This requires three separate arguments. 


For exercises 20 and 21 assume that addition and scalar multiplication are defined on the objects and that there exists an additive identity.

20. Argue that if a set of objects is linearly independent then none of the objects is a linear combination of the others. HINT: Try proof by contraposition with multiple cases. Suppose one of the objects in the set is a linear combination of the others, and logically conclude that the set is linearly dependent.

21. Argue that if none of the objects of a set is a linear combination of the others, then the set is linearly independent. HINT: Try proof by contraposition. Suppose the set is linearly dependent, and logically conclude that one of the objects in the set is a linear combination of the others.

The key ingredient in the proof of the uniqueness of reduced row echelon form (crumpet 23 on page 160) is that row operations do not affect the linear dependence relationships among the columns of a matrix. To illustrate the claim, three matrices are given in exercises 22 and 23: M; F, a row echelon form of M; and R, the reduced row echelon form of M.

22. Linear dependence is maintained. (i) Find a nontrivial linear combination of (some of) the columns of R that sums to 0. (ii) Check that the same linear combination of columns of F sums to 0. (iii) Check that the same linear combination of columns of M sums to 0.


1 0 -11/8 O 


_|o 1 98 0 
Bo Ge ag A 
00 O 0 

6 F oo 4 

0 -8 -9 10 
eS). 6 DF 
00 0 0 


0 120 135-136 
-45  -279 —-252 289 


ian 56 63 63 
9 243 261 268 
1 -7/9 0 0 
0 oo 10 
Wa GE so OA 
0 0 00 
-9 7 -l1 -ll 
0 0 10 -3 
lac’ ee ee 
0 0 0 0 


180-140 950 -20 
-108 84 -512 -ll 
M=! 279 36 =358. o | SO? 


-189 147 -971 12 


23. Linear independence is maintained. (i) Find two different pairs of linearly independent columns of R. (ii) Check that the same two pairs of columns of F are linearly independent. (iii) Check that the same two pairs of columns of M are linearly independent.


i. <6/1t O Std 
0 oO 1 4 
Gk=)G og @ 6 
0 0 0 0 
11 <6 —2 <5 
6 0 3 2 
I=) Goo © © 
0 0 0 0 
-33 18 18 63 
Pe ae a 
=| 1056 576 —204 528 
253. -138 -49 —127 
1 -1 0 -89/56 
_|o ® i —=17 
R=) Go OO 
00 0 0 
-§ 8 12 11 
O Ue 7 at 
Fe 6 a OO 
0 0 0 0 


-64 64 166 78 
-264 264 543 342 
M=! _39) 392 826 505 | [Al390 


-136 136 288 175 
3.4 Characterization of m × n Matrices


Theorem 5 of section 3.3 has a counterpart phrased in terms of the rows of M. Parts (i), (ii), (iii), and (vii) of the 
following theorem can be justified through reference to the previous, but parts (iv), (v), and (vi) cannot. 


Theorem 6. [Characterization of Matrices Part 2] Suppose M is an m × n matrix, v and c have n entries, and b and w have m entries. Then the following are equivalent.


(i) The rows of M are linearly independent.
(ii) No row of M is a linear combination of the others.
(iii) $w^TM = 0^T$ has only the trivial solution.
(iv) M has a pivot position in every row.
(v) There is a matrix R such that MR = I.
(vi) Mv = b has at least one solution for every b.
(vii) $w^TM = c^T$ has no free variables.


The justification for this theorem proceeds by logically connecting the statements according to the following diagram. 


         (vii)
        ⇙     ⇖
(i) ⇒ (ii) ⇒ (iii)
 ⇓              ⇑
(iv) ⇒ (vi) ⇒ (v)


Though the first several implications can be proven without reference to theorem 5, such reference will be used to 
emphasize the direct connection between the two theorems. 


(i) ⇒ (ii) The rows of M are linearly independent, so the columns of $M^T$ are linearly independent. By theorem 5, none of the columns of $M^T$ can be written as a linear combination of the others, so none of the rows of M can be written as a linear combination of the others.


(ii) ⇒ (iii) No row of M can be written as a linear combination of the others, so no column of $M^T$ can be written as a linear combination of the others. By theorem 5, $M^Tw = 0$ has only the trivial solution. Since $M^Tw = 0$ is equivalent to $(M^Tw)^T = 0^T$ (transpose both sides), which is equivalent to $w^TM = 0^T$ (simplifying the lefthand side), the conclusion follows.


(iii) ⇒ (vii) Since $w^TM = 0^T$ has only the trivial solution, the equivalent equations $(w^TM)^T = (0^T)^T$ and $M^Tw = 0$ have only the trivial solution. By theorem 5, $M^Tw = c$ has no free variables. Therefore, the equivalent equation $w^TM = c^T$ has no free variables.


(vii) ⇒ (i) Since $w^TM = c^T$ has no free variables, the equivalent equation $M^Tw = c$ has no free variables. By theorem 5, the columns of $M^T$ are linearly independent. Therefore, the rows of M are linearly independent.


(i) ⇒ (iv) Suppose M does not have a pivot in every row. Then any row echelon form of M has a row of zeros. Since that row of zeros is the result of a nontrivial linear combination of the rows of M, there is a nontrivial linear combination of the rows of M that sums to $0^T$. Therefore the rows of M are not linearly independent.


(iv) ⇒ (vi) Since M has a pivot in every row, no row echelon form of M has a row of zeros. Therefore [ M  b ] cannot have a pivot in the rightmost column, and by theorem 1 the system Mv = b is consistent (has at least one solution) for any b.


(vi) ⇒ (v) Let $b_j = (I_{m \times m})_{:,j}$ for j = 1, 2, ..., m. Since Mv = b has a solution for every b, there is a vector $v_j$ such that $Mv_j = b_j$. Letting $R = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_m \,]$ we have MR = I.
(v) ⇒ (iii) Suppose there is a matrix R such that MR = I. If $w^TM = 0^T$, the following equations are deduced by matrix algebra.

\[
\begin{aligned}
(w^TM)R &= 0^TR \\
w^T(MR) &= 0^T \\
w^TI &= 0^T \\
w^T &= 0^T
\end{aligned}
\]

Hence $w^T = 0^T$ (w = 0) is the only solution of $w^TM = 0^T$.
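A small numerical illustration of the last two arguments may help. In the Python sketch below, the 2 × 3 matrix M, its right inverse R, and the helper `mat_mul` are all hypothetical choices made for this example. The columns of R were found by solving $Mv_1 = (1, 0)$ and $Mv_2 = (0, 1)$ by inspection, as in (vi) ⇒ (v), and the final check mirrors the algebra of (v) ⇒ (iii).

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Hypothetical 2 x 3 matrix with a pivot in every row (not from the text).
M = [[1, 2, 0],
     [0, 1, 1]]
# Column j of R is a solution v_j of M v_j = (column j of the identity).
R = [[1, -2],
     [0, 1],
     [0, 0]]
I2 = [[1, 0], [0, 1]]
print(mat_mul(M, R) == I2)                 # True, so MR = I

# (v) => (iii): (w^T M) R = w^T (MR) = w^T, so w^T M = 0^T forces w^T = 0^T.
w_T = [[3, -7]]                            # an arbitrary row vector
print(mat_mul(mat_mul(w_T, M), R) == w_T)  # True
```

The right inverse is not unique here, since Mv = b has free variables. Any other choice of solutions $v_1$, $v_2$ would serve equally well.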


The justification for many of the upcoming claims relies heavily on induction (see axiom 5 of crumpet 16 on page 
72). The principle behind induction is to show that (i) the claim is actually true for some particular integer, and (ii) 
if the claim is true for some integer at least as large, then it is also true for the successive integer. This way, part (i) 
establishes the claim for a particular integer, say k. Then part (ii) establishes the claim for the successive integer, k+ 1. 
Part (ii) also establishes the claim for the successor to the successive integer, k+2. Applying part (ii) again establishes 
the claim for k + 3, and so on, part (ii) establishing the claim for all integers greater than k. Induction is often the
most practical way to show that a statement is true for all integers and is particularly useful in proving claims about 
matrices of size n (for all n). Proofs of this nature will be shown, but this is not a course on proof technique, so it is 
up to you or your instructor to decide how deeply you need to understand these proofs. Even if you are not prepared 
to write your own induction proof or fully understand one, reading them is a good way to get a feel for the technique. 

An upper triangular matrix is one in which all entries below the main diagonal are zero, and a lower triangular 
matrix is one in which all entries above the main diagonal are zero. Using * to represent any number (as in the 
notation of section 2.3), a square upper triangular matrix looks like 


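\[
\begin{bmatrix}
* & * & * & * & \cdots & * \\
0 & * & * & * & \cdots & * \\
0 & 0 & * & * & \cdots & * \\
0 & 0 & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & *
\end{bmatrix}
\]

and a square lower triangular matrix looks like

\[
\begin{bmatrix}
* & 0 & 0 & 0 & \cdots & 0 \\
* & * & 0 & 0 & \cdots & 0 \\
* & * & * & 0 & \cdots & 0 \\
* & * & * & * & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
* & * & * & * & \cdots & *
\end{bmatrix}
\]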


All the nonzero entries are above or on the main diagonal (the upper triangle) for an upper triangular matrix, and all 
the nonzero entries are below or on the main diagonal (the lower triangle) for a lower triangular matrix. 

The determinant of a lower triangular matrix is the product of the entries on its diagonal. Now is a good time to 
work out a couple examples on your own and think about why this is true in general. We will prove it by induction, 
but the proof may not resonate with you the way your own thoughts about it will. 


Claim. If L is a lower triangular n × n matrix, then $\det L = L_{1,1}L_{2,2} \cdots L_{n,n}$.


Proof. If L is a 1 × 1 matrix, it is lower triangular and $\det L = \det([L_{1,1}]) = L_{1,1}$. This establishes part (i) of the proof. The claim is true for the particular value n = 1. Now we assume that the claim is true for some (arbitrary) value n = k greater than or equal to one. That is, if L is a lower triangular k × k matrix and k ≥ 1, then $\det L = L_{1,1}L_{2,2} \cdots L_{k,k}$. To complete the proof, we must use this information to prove that if L is a lower triangular (k + 1) × (k + 1) matrix, the next size up, then $\det L = L_{1,1}L_{2,2} \cdots L_{k+1,k+1}$. To that end suppose L is a lower triangular (k + 1) × (k + 1) matrix. By definition,

\[ \det L = (-1)^{1+1}L_{1,1}\det L_{\setminus 1,1} + (-1)^{1+2}L_{1,2}\det L_{\setminus 1,2} + \cdots + (-1)^{1+(k+1)}L_{1,k+1}\det L_{\setminus 1,k+1}. \tag{3.4.1} \]

Since L is lower triangular, $L_{1,j} = 0$ whenever j > 1. Therefore, all the terms of the determinant are zero except the first one. The determinant simplifies to

\[ \det L = L_{1,1}\det L_{\setminus 1,1}. \tag{3.4.2} \]

But $L_{\setminus 1,1}$ is a lower triangular k × k matrix, so its determinant is the product of the entries on its diagonal (this is our inductive hypothesis). So $\det L_{\setminus 1,1} = L_{2,2}L_{3,3} \cdots L_{k+1,k+1}$. Substituting this expression into (3.4.2), we have $\det L = L_{1,1}L_{2,2}L_{3,3} \cdots L_{k+1,k+1}$, and the proof is complete. □
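The claim is easy to test by machine. The Python sketch below computes a determinant by cofactor expansion along the first row, in the spirit of (3.4.1), and compares it with the product of the diagonal entries. The names `minor` and `det` and the 4 × 4 matrix are hypothetical, chosen only for illustration, and indices are 0-based as is usual in Python.

```python
from math import prod


def minor(M, i, j):
    """M with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))


# Hypothetical lower triangular example (not from the text).
L = [[2, 0, 0, 0],
     [5, -1, 0, 0],
     [7, 4, 3, 0],
     [1, 9, 8, 6]]
diagonal_product = prod(L[i][i] for i in range(len(L)))
print(det(L), diagonal_product)  # -36 -36
```

Just as in the proof, every term of the first-row expansion after the first is multiplied by a zero entry, so only the minor $L_{\setminus 1,1}$ contributes at each level of the recursion.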


Exercises 17 through 19 request a proof that the determinant of a square upper triangular matrix is also the product of the entries on the main diagonal.
Key Concepts 
characterization of matrices see theorem 6. 
upper triangular matrix a matrix in which all entries below the main diagonal are zero. 
lower triangular matrix a matrix in which all entries above the main diagonal are zero. 
determinant of a (square) lower triangular matrix the product of the entries on the main diagonal. 
determinant of a (square) upper triangular matrix the product of the entries on the main diagonal. 


proof by induction showing that (i) the claim is true for some particular integer, and (ii) if the claim is true for some 
integer at least as large, then it is also true for the next integer. These together prove that the claim is true for 
all integers greater than or equal to the particular integer of part (i). 


Exercises (b) 2! 7 8x + 5x3 = dy 
x, + 7x. + x3 = bo 
1. The size and number of pivot positions of a matrix M are 


given. Answer the following questions as completely as iS We i Dy 
you can. (i) Are the rows of M linearly independent? (ii) (c) ™M + 2v2 = by 
Are the columns of M linearly independent? (iii) How -Sy - vw +t 3 = Os 
many solutions does Mv = 0 have? (iv) How many solu- w + 6x - 6y + 52 = Db, 
tions does Mv = b have for arbitrary b? (d) 5w + 7x + 3y + 2g = Dy 
(a) 5X 8:5 Iwo + 2x - y = b 
(b) 3x3;2 5. Show that the homogeneous system has only one solu- 
(c) 7X7;7 tion, the trivial solution. 
(d) 9x 6;6 [A]-351 lay -¥ 7 | ss? 


2. The size of a matrix M is given. (i) What is the maxi- 2 § 2 
mum number of pivot positions M could have? Assume (b) [ X}  X2 | 5-8 0 | = [ 0 0 | 
it has that maximum number and answer the following 
questions. (11) Are the rows of M linearly independent? 5 6 1 
(iii) Are the columns of M linearly independent? (iv) (c) ‘| 6 -7 -2 | = [ 0 0 0 | 
How many solutions does Mv = 0 have? (v) How many -1 -4 -l 


solutions does Mv = b have for arbitrary b? 6 5 1 5 
(a) 13x5 @[v vm vw |} 3 0 -4 7 |=0" [s}- 
-1 1 1 -4 


(b) 12x 12 300 


29 [A]-351 
hee) aes 6. Do the rows of the matrix form a linearly independent 


3. Redo question 2 parts (ii)-(v) assuming M has less than set? 
the maximum number of pivot positions. [A]-351 12 21 
4. Show that the linear system has at least one solution for (a) | -16 -28 | 
any values b,, b, b3. 3 6 
(a) 4x + Illy = Dy (b) | 4 11 
Y 5x + Wy = by 6 7 
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3.6 -12 
© | 3 2 =| 
-1 -3 -10 
@|-7 9 5 
2 -5 0 
-18 1 -1 7 
(| -6 2 6 3 | [s}-299 
12 -5 0 4 
5 2 -3 
-4 -6 7 
Oi Gas 
0 -12 -11 


-38 -5 30 25 —-44 
4-28 44 -39 43 

a1 42 1 <1 99 

()) 47 12 26 -10 -8 
-13 -15 39 -9 22 

9 -3 3 -41 49 


[S]-299 


7. A 5 × 8 matrix has linearly independent rows. How many pivot positions does it have? [A]-351

8. A 6 × 7 matrix has 5 pivot positions. What can you say about the linear independence of its rows?

9. Give an example of a 2 × 3 matrix M such that $v^TM = 0^T$
(a) has only the trivial solution
(b) has a nontrivial solution

10. What are the possible reduced row echelon forms of a 3 × 4 matrix with [A]-351
(a) linearly independent rows?
(b) linearly dependent rows?

11. What are the possible row echelon forms of a 2 × 2 matrix with
(a) linearly independent rows?
(b) linearly dependent rows?

Compare your answer with the answer to section 3.3 exercise 13.


12. Find the determinant. 


2 0 
@) -137 -3 | 


= th! «6 
(b) | -15 4 0 | [A}-351 
25g) 27. aq 
-4 0 0 0 
5: =5. @: @ 
©} 3 4 3 0 
27 «37 «41 5 


13. If M is an m × n matrix with linearly independent rows, what can you say about the relationship between m and n? HINT: Can a matrix with linearly independent rows have more rows than columns?


14. Find the value(s) of x for which the matrix has linearly 
independent rows. 


2 -6 
@|3 3 | 
(b) | i : | [A]-351 


1 8 0 
(c) | 6 x 1 | 
ae ee 


1 8 2 
(d) 6 45 1 | [A]-351 
-3 -20 x 


Compare your answer with the answer to section 3.3 exercise 15.


15. Find the value(s) of x for which the matrix has linearly 
dependent rows. 


3 x 
(a) -5 4 | 
(b) 2 | [A]-351 
x 5 
x -6 27 
(c) | —2 8 -30 | 
-1 5 -18 
1 5 -4 
(d) | —5 3 x [A]-351 
-7 -ll 4 


Compare your answer with the answer to section 3.3 exercise 16.


16. Find a nontrivial solution of $v^TM = 0^T$ using the fact that the first and second rows of M are identical. Do not use row operations.

\[ M = \begin{bmatrix} -14 & -29 & 49 & -32 \\ -14 & -29 & 49 & -32 \\ 44 & -25 & 13 & -35 \end{bmatrix} \]


17. Prove that if U is upper triangular, so is $U_{\setminus 1,1}$. [S]-299

18. Prove that if U is upper triangular, then $U_{\setminus 1,j}$ has a zero on its main diagonal whenever j > 1. [S]-299

19. Prove that if U is an upper triangular n × n matrix, then $\det U = U_{1,1}U_{2,2} \cdots U_{n,n}$. HINT: Use the facts proven in exercises 17 and 18. [S]-299
3.5 The Determinant Revisited


Pick a number. Any number. 

Add 6. 

Multiply (your new number) by 6. 
Subtract 9. 

Divide by 3. 

Subtract 9. 


Tell me your latest number, and I’ll tell you your starting number. It’s half of your latest number! Putting the instructions into symbols, you are being asked to calculate $\frac{6(x+6)-9}{3} - 9$, which simplifies to $\frac{6x+27}{3} - 9 = 2x + 9 - 9 = 2x$. Hence the number you start with, x, will always be half of what you end with!
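If you would rather let a machine play the audience, the instructions translate directly into code. This short Python sketch (the function name `trick` is an assumption made for illustration) follows the five steps with exact fractions and confirms that the result is always twice the starting number over a range of integers.

```python
from fractions import Fraction


def trick(x):
    """Add 6, multiply by 6, subtract 9, divide by 3, subtract 9."""
    x = Fraction(x)
    return ((x + 6) * 6 - 9) / 3 - 9


# Each step can be undone, which is why the starting number is recoverable.
print(all(trick(x) == 2 * x for x in range(-50, 51)))  # True
```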


Matrices and row operations work in a similar fashion, and you can be the magician. I'll pick a matrix...any matrix...and tell you what it looks like after some operation. After swapping the first two rows, my matrix is

\[ \begin{bmatrix} 13 & -11 & -19 \\ 23 & 12 & 22 \\ -16 & 19 & -5 \end{bmatrix}. \]

What was my original matrix? Answer on page 105.


How about another? After scaling the third row of my matrix by 2, my matrix is

\[ \begin{bmatrix} 2 & -13 & -6 \\ -7 & -9 & 21 \\ -14 & -18 & 24 \end{bmatrix}. \]

What was my original matrix? Answer on page 105.


And one last one...after replacing the first row of my matrix by the first row plus three times the third, my matrix 


ll 9 -l 
3 5 -8 |. 


3 8 | 


iS 


What was my original matrix? Answer on page 105. 


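In each case the row operation can be undone to recover the original matrix. This is exactly the concept of an inverse! The six elementary matrices corresponding to the six row operations (the row operations that gave the matrices above and the row operations used to recover the original matrices), in the order encountered are

\[
\begin{array}{cc}
\text{operation} & \text{recovery} \\[4pt]
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} &
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\[12pt]
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} &
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{2} \end{bmatrix} \\[12pt]
\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} &
\begin{bmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\end{array}
\]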
0 
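It must therefore be that each operation matrix is invertible and

\[
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{2} \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]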


Multiplying the operation matrix by the inverse matrix will verify the inverse pairs. Each elementary matrix has an 
inverse elementary matrix of the same type, and in the case of a row swap, the elementary matrix is its own inverse. 
Now notice a couple of things about the determinants: 


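\[
\begin{vmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1,
\]

\[
\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{vmatrix} = 2
\quad\text{and}\quad
\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{2} \end{vmatrix} = \frac{1}{2},
\]

\[
\begin{vmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1
\quad\text{and}\quad
\begin{vmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1.
\]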

Come to think of it, the determinant of a scale matrix will always be the scale factor! Can you justify this claim? Answer on page 105. Wait a minute! The determinant of a replace matrix is always 1. Can you justify this claim? Answer on page 105.
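Both observations, the inverse pairs and the determinants, can be checked mechanically for the three operations used in this section (swap rows 1 and 2, scale row 3 by 2, replace row 1 by row 1 plus three times row 3). The Python sketch below is only an illustration. The helper names `mat_mul` and `det3` are assumptions, and `det3` is the first-row expansion written out for the 3 × 3 case.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
scale = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
scale_inv = [[1, 0, 0], [0, 1, 0], [0, 0, Fraction(1, 2)]]
replace = [[1, 0, 3], [0, 1, 0], [0, 0, 1]]
replace_inv = [[1, 0, -3], [0, 1, 0], [0, 0, 1]]

pairs = [(swap, swap), (scale, scale_inv), (replace, replace_inv)]
print(all(mat_mul(E, F) == I3 for E, F in pairs))  # True
print([det3(E) for E, _ in pairs])                 # [-1, 2, 1]
```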
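Is the determinant of a swap matrix always −1? There is no such thing as a swap matrix with one row. There is
only one swap matrix with two rows:

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]

and there are only three swap matrices with three rows:

\[
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad \text{and} \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]

It is easy enough to check that all four of these matrices have determinant −1, so maybe all swap matrices do have
determinant −1.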
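A brute-force check of small cases is no proof, but it does lend some confidence to the conjecture. The sketch below builds every swap matrix with n rows by choosing the pair of rows to swap (there are n(n − 1)/2 such pairs) and confirms the determinant is −1 for n = 2 through 5. The function names are hypothetical helpers for this illustration only.

```python
from itertools import combinations
from math import comb

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def swap_matrix(n, i, j):
    """Identity matrix of size n with rows i and j exchanged (0-indexed)."""
    M = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    M[i], M[j] = M[j], M[i]
    return M

def swap_matrices(n):
    """All swap matrices with n rows, one per unordered pair of rows."""
    return [swap_matrix(n, i, j) for i, j in combinations(range(n), 2)]

for n in range(2, 6):
    mats = swap_matrices(n)
    assert len(mats) == n * (n - 1) // 2 == comb(n, 2)
    assert all(det(S) == -1 for S in mats)

counts = {n: len(swap_matrices(n)) for n in (2, 3, 4)}
print(counts)
```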




Crumpet 19: Binomial Coefficients


The number of swap matrices with n rows, n > 1, is $\frac{n(n-1)}{2}$. For example, there are $\frac{4(3)}{2} = 6$ swap matrices
with 4 rows, and there are $\frac{100(99)}{2} = 4950$ swap matrices with 100 rows. The formula $\frac{n(n-1)}{2}$ is a special case of
the “choose formula”, which is a formula for the number of ways to choose k objects from a set of n objects, with
0 ≤ k ≤ n. This number is also known as a binomial coefficient and there are several notations for it. Common
notations and the formula for “n choose k” are

\[
\binom{n}{k} = {}_nC_k = C(n,k) = \frac{n!}{k!\,(n-k)!}.
\]

Proving it for all n × n swap matrices requires induction, but before we can do it cleanly, we need one more fact: if M
is an n × n matrix with n > 1, then

\[
\begin{aligned}
\det M &= (-1)^{i+1} M_{i,1} \det M_{\backslash i,1} + (-1)^{i+2} M_{i,2} \det M_{\backslash i,2} + \cdots + (-1)^{i+n} M_{i,n} \det M_{\backslash i,n} \\
&= (-1)^{1+j} M_{1,j} \det M_{\backslash 1,j} + (-1)^{2+j} M_{2,j} \det M_{\backslash 2,j} + \cdots + (-1)^{n+j} M_{n,j} \det M_{\backslash n,j}
\end{aligned}
\tag{3.5.1}
\]

for any i from 1 through n or any j from 1 through n. This formula implies that determinants may be computed by
expanding along any row or any column, not just the first row. To illustrate,


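expansion along row 1

\[
\begin{aligned}
\begin{vmatrix} -5 & 4 & -1 \\ 2 & -2 & 3 \\ 5 & -2 & -3 \end{vmatrix}
&= -5 \begin{vmatrix} -2 & 3 \\ -2 & -3 \end{vmatrix} - 4 \begin{vmatrix} 2 & 3 \\ 5 & -3 \end{vmatrix} - 1 \begin{vmatrix} 2 & -2 \\ 5 & -2 \end{vmatrix} \\
&= -5(6 + 6) - 4(-6 - 15) - 1(-4 + 10) \\
&= -60 + 84 - 6 \\
&= 18
\end{aligned}
\]

expansion along row 3

\[
\begin{aligned}
\begin{vmatrix} -5 & 4 & -1 \\ 2 & -2 & 3 \\ 5 & -2 & -3 \end{vmatrix}
&= 5 \begin{vmatrix} 4 & -1 \\ -2 & 3 \end{vmatrix} + 2 \begin{vmatrix} -5 & -1 \\ 2 & 3 \end{vmatrix} - 3 \begin{vmatrix} -5 & 4 \\ 2 & -2 \end{vmatrix} \\
&= 5(12 - 2) + 2(-15 + 2) - 3(10 - 8) \\
&= 50 - 26 - 6 \\
&= 18
\end{aligned}
\]

expansion along column 2

\[
\begin{aligned}
\begin{vmatrix} -5 & 4 & -1 \\ 2 & -2 & 3 \\ 5 & -2 & -3 \end{vmatrix}
&= -4 \begin{vmatrix} 2 & 3 \\ 5 & -3 \end{vmatrix} - 2 \begin{vmatrix} -5 & -1 \\ 5 & -3 \end{vmatrix} + 2 \begin{vmatrix} -5 & -1 \\ 2 & 3 \end{vmatrix} \\
&= -4(-6 - 15) - 2(15 + 5) + 2(-15 + 2) \\
&= 84 - 40 - 26 \\
&= 18
\end{aligned}
\]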
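The same computation can be sketched in a few lines of Python. The functions `det_row` and `det_col` below are illustrative names (not from this book); each expands along a chosen row or column, using 0-indexed positions, which leaves the sign pattern unchanged because (i + 1) + (j + 1) has the same parity as i + j.

```python
def minor(M, i, j):
    """Matrix M with row i and column j removed (0-indexed)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det_row(M, i=0):
    """Determinant by expansion along row i."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** (i + j) * M[i][j] * det_row(minor(M, i, j)) for j in range(n))

def det_col(M, j=0):
    """Determinant by expansion along column j."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** (i + j) * M[i][j] * det_row(minor(M, i, j)) for i in range(n))

M = [[-5, 4, -1], [2, -2, 3], [5, -2, -3]]

# Expand along each of the three rows, then each of the three columns.
values = [det_row(M, i) for i in range(3)] + [det_col(M, j) for j in range(3)]
print(values)  # every entry is 18
```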


Formula (3.5.1) makes it relatively straightforward to prove that all swap matrices have determinant −1. We have
already shown (assuming you did the calculation above) that the determinant of any 2 × 2 swap matrix (and there is only
one of them) is −1. Proceeding by induction, assume that for some k ≥ 2, the determinant of all k × k swap matrices
is −1, and let S be a particular but arbitrary (k + 1) × (k + 1) swap matrix where rows i and j have been swapped.
Since k + 1 ≥ 3, there must be a row of S that is not involved in the swap, say row ℓ. Then expanding the determinant
of S along row ℓ yields

\[
\begin{aligned}
\det S &= (-1)^{\ell+1} S_{\ell,1} \det S_{\backslash \ell,1} + (-1)^{\ell+2} S_{\ell,2} \det S_{\backslash \ell,2} + \cdots + (-1)^{\ell+k+1} S_{\ell,k+1} \det S_{\backslash \ell,k+1} \\
&= (-1)^{\ell+\ell} S_{\ell,\ell} \det S_{\backslash \ell,\ell}
\end{aligned}
\]

because $S_{\ell:}$ is the ℓth row of the identity matrix (not being involved in the swap), meaning $S_{\ell,m} = 0$ whenever
m ≠ ℓ and $S_{\ell,\ell} = 1$. Since the swapped rows are both in $S_{\backslash \ell,\ell}$, $S_{\backslash \ell,\ell}$ is a k × k swap matrix and the inductive hypothesis implies
$\det S_{\backslash \ell,\ell} = -1$. Therefore,

\[
\det S = (-1)^{2\ell} S_{\ell,\ell} \det S_{\backslash \ell,\ell} = (1)(1)(-1) = -1,
\]

completing the proof.


Crumpet 20: Proof of Formula (3.5.1) 


Proving formula (3.5.1) requires a bit of work. One way to do it is to prove that (i) there is only one function G taking
n × n matrices as inputs and returning scalars as outputs with the following four properties and (ii) the expressions in
formula (3.5.1) have these four properties. Thus, each expression must give the same result.


1. G(I) = 1.

2. G(A) = 0 whenever A has two identical columns.

3. If A, B, and C are identical except in their kth columns where $C_{:k} = A_{:k} + B_{:k}$, then G(C) = G(A) + G(B).

4. If A and B are identical except in their kth columns where $A_{:k} = cB_{:k}$, then G(A) = cG(B).


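To begin, suppose G is a function from n × n matrices to scalars satisfying the four properties above. Then G also has
the following two properties.


5. G(A) = 0 whenever the columns of A are linearly dependent. Proof: Because the columns of A are linearly
dependent, one of them, say column k, can be written as a linear combination of the others. That is,
$A_{:k} = \sum_{j \neq k} c_j A_{:j}$ for some constants $c_j$. Then

\[
\begin{aligned}
G(A) &= G\Big(\Big[\, A_{:1} \;\cdots\; A_{:k-1} \;\; \textstyle\sum_{j \neq k} c_j A_{:j} \;\; A_{:k+1} \;\cdots\; A_{:n} \,\Big]\Big) \\
&= \sum_{j \neq k} G\big(\big[\, A_{:1} \;\cdots\; A_{:k-1} \;\; c_j A_{:j} \;\; A_{:k+1} \;\cdots\; A_{:n} \,\big]\big) \\
&= \sum_{j \neq k} c_j \, G\big(\big[\, A_{:1} \;\cdots\; A_{:k-1} \;\; A_{:j} \;\; A_{:k+1} \;\cdots\; A_{:n} \,\big]\big) \\
&= \sum_{j \neq k} c_j \cdot 0 = 0
\end{aligned}
\]

by applying properties 3, 4, and 2, respectively.

6. G(B) = −G(A) whenever B is the result of swapping two columns of A. Proof: Suppose B is the result of
swapping columns i and j of A, and without loss of generality, assume i < j. Then

\[
B = \big[\, A_{:1} \;\cdots\; A_{:j} \;\cdots\; A_{:i} \;\cdots\; A_{:n} \,\big].
\]

Now let $C = \big[\, A_{:1} \;\cdots\; A_{:i} + A_{:j} \;\cdots\; A_{:i} + A_{:j} \;\cdots\; A_{:n} \,\big]$. Then by repeated application of property 3,

\[
\begin{aligned}
G(C) = G(A) + G(B) &+ G\big(\big[\, A_{:1} \;\cdots\; A_{:i} \;\cdots\; A_{:i} \;\cdots\; A_{:n} \,\big]\big) \\
&+ G\big(\big[\, A_{:1} \;\cdots\; A_{:j} \;\cdots\; A_{:j} \;\cdots\; A_{:n} \,\big]\big)
\end{aligned}
\]

and by property 2 the last two terms are zero as is G(C). Hence, 0 = G(A) + G(B), concluding the proof.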


To begin the induction proof that G is unique, note that G([a]) = G(a[1]) = aG([1]) = aG(I) = a by properties 4
and 1, so G is uniquely determined for 1 × 1 matrices. Now suppose G is unique for all (k − 1) × (k − 1) matrices
for some k > 1, and let M be a particular but arbitrary k × k matrix. If the columns of M are linearly dependent,
then property 5 implies G(M) = 0, so G(M) is uniquely determined. Now suppose the columns of M are linearly
independent. By theorem 5, M has a pivot in every column. Since M is square, M has a pivot in every row. Therefore
M has a nonzero entry in row k, say $M_{k,j} \neq 0$. Letting


My j-1 
Me oes, 


My, j J 


Mx, j-+1 
Mz jn. - M. 


Mj Saif 
repeated application of properties 3 and 4 plus property 6 if needed implies G(A) = ±G(M) depending on whether


j = k. Either way, G(M) is uniquely determined if G(A) is. Note that $A_{k:} = [\, 0 \;\cdots\; 0 \;\; M_{k,j} \,]$, making it
sufficient to show that H(B) defined on (k − 1) × (k − 1) matrices by


= oily Bo Mi 4-1,) 
H(B) i | or" M,; 


is uniquely determined. But H(B) inherits properties 2, 3, and 4 from G, so it only remains to establish that H(I) = 1.
Then, by the inductive hypothesis, H is uniquely determined. To that end,


oes TO Myx, 
mom ek, 


ll 
= 


concluding the proof that G is unique. 


To complete the proof of formula (3.5.1), it remains to show that each of the two expressions for det M has
properties 1 through 4. This is because formula (1.5.1), the definition of determinant, is one of the expressions, so by
uniqueness they all produce the same result (as the determinant).


Proceeding by induction, note that det([a]) = a satisfies all four properties, so the determinant has all four
properties on 1 × 1 matrices. Now suppose det M has all four properties on (ℓ − 1) × (ℓ − 1) matrices for some ℓ > 1
and consider, for any fixed i = 1, 2, …, ℓ, the formula

\[
\det M = (-1)^{i+1} M_{i,1} \det M_{\backslash i,1} + (-1)^{i+2} M_{i,2} \det M_{\backslash i,2} + \cdots + (-1)^{i+\ell} M_{i,\ell} \det M_{\backslash i,\ell}
\]

on ℓ × ℓ matrices.


1. Applying the formula to the identity matrix,

\[
\begin{aligned}
\det I_{\ell \times \ell} &= (-1)^{i+1} I_{i,1} \det I_{\backslash i,1} + (-1)^{i+2} I_{i,2} \det I_{\backslash i,2} + \cdots + (-1)^{i+\ell} I_{i,\ell} \det I_{\backslash i,\ell} \\
&= (-1)^{i+i} I_{i,i} \det I_{\backslash i,i}
\end{aligned}
\]

since $I_{i,j} = 0$ whenever j ≠ i. But $I_{\backslash i,i} = I_{(\ell-1) \times (\ell-1)}$, so by the inductive hypothesis $\det I_{\backslash i,i} = 1$ and we have
$\det I_{\ell \times \ell} = (-1)^{2i} I_{i,i} (1) = (1)(1)(1) = 1$.


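2. Suppose M is a particular but arbitrary ℓ × ℓ matrix with columns j and k identical, and without loss of generality
assume j < k. Then

\[
\begin{aligned}
\det M &= (-1)^{i+1} M_{i,1} \det M_{\backslash i,1} + (-1)^{i+2} M_{i,2} \det M_{\backslash i,2} + \cdots + (-1)^{i+\ell} M_{i,\ell} \det M_{\backslash i,\ell} \\
&= (-1)^{i+j} M_{i,j} \det M_{\backslash i,j} + (-1)^{i+k} M_{i,k} \det M_{\backslash i,k}
\end{aligned}
\]

since $M_{\backslash i,m}$ has two identical columns whenever m ∉ {j, k} and therefore $\det M_{\backslash i,m} = 0$ by the inductive
hypothesis. But because columns j and k of M are identical, $M_{i,j} = M_{i,k}$, so we can rewrite
$\det M = (-1)^{i+j} M_{i,j} \big[ \det M_{\backslash i,j} + (-1)^{k-j} \det M_{\backslash i,k} \big]$. Now if we swap columns j and j + 1 of $M_{\backslash i,k}$, and then columns
j + 1 and j + 2, and so on, for a total of k − j − 1 swaps, the result is $M_{\backslash i,j}$, so by the inductive
hypothesis $\det M_{\backslash i,j} = (-1)^{k-j-1} \det M_{\backslash i,k}$. Substituting into the latest expression for det M, we have

\[
\det M = (-1)^{i+j} M_{i,j} \big[ (-1)^{k-j-1} \det M_{\backslash i,k} + (-1)^{k-j} \det M_{\backslash i,k} \big]
= (-1)^{i+k-1} M_{i,j} \big[ \det M_{\backslash i,k} - \det M_{\backslash i,k} \big] = 0.
\]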


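3. Suppose A, B, and C are identical ℓ × ℓ matrices except in their kth columns where $C_{:k} = A_{:k} + B_{:k}$. Observe
that $C_{\backslash i,k} = B_{\backslash i,k} = A_{\backslash i,k}$, $C_{i,j} = B_{i,j} = A_{i,j}$, and $\det C_{\backslash i,j} = \det A_{\backslash i,j} + \det B_{\backslash i,j}$ (by the inductive hypothesis) for
j ≠ k and all i. It then follows that

\[
\begin{aligned}
\det C &= (-1)^{i+1} C_{i,1} \det C_{\backslash i,1} + (-1)^{i+2} C_{i,2} \det C_{\backslash i,2} + \cdots + (-1)^{i+\ell} C_{i,\ell} \det C_{\backslash i,\ell} \\
&= (-1)^{i+k} C_{i,k} \det C_{\backslash i,k} + \sum_{j \neq k} (-1)^{i+j} C_{i,j} \det C_{\backslash i,j} \\
&= (-1)^{i+k} (A_{i,k} + B_{i,k}) \det C_{\backslash i,k} + \sum_{j \neq k} (-1)^{i+j} C_{i,j} \big[ \det A_{\backslash i,j} + \det B_{\backslash i,j} \big] \\
&= (-1)^{i+k} A_{i,k} \det C_{\backslash i,k} + (-1)^{i+k} B_{i,k} \det C_{\backslash i,k}
 + \sum_{j \neq k} (-1)^{i+j} \big[ C_{i,j} \det A_{\backslash i,j} + C_{i,j} \det B_{\backslash i,j} \big] \\
&= (-1)^{i+k} A_{i,k} \det A_{\backslash i,k} + (-1)^{i+k} B_{i,k} \det B_{\backslash i,k}
 + \sum_{j \neq k} (-1)^{i+j} \big[ A_{i,j} \det A_{\backslash i,j} + B_{i,j} \det B_{\backslash i,j} \big] \\
&= \sum_{j=1}^{\ell} (-1)^{i+j} \big[ A_{i,j} \det A_{\backslash i,j} + B_{i,j} \det B_{\backslash i,j} \big] = \det A + \det B.
\end{aligned}
\]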


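4. Suppose A and B are identical ℓ × ℓ matrices except in their kth columns where $A_{:k} = cB_{:k}$. Observe that
$B_{\backslash i,k} = A_{\backslash i,k}$, $B_{i,j} = A_{i,j}$, and $\det A_{\backslash i,j} = c \det B_{\backslash i,j}$ (by the inductive hypothesis) for j ≠ k and all i. It then
follows that

\[
\begin{aligned}
\det A &= (-1)^{i+1} A_{i,1} \det A_{\backslash i,1} + (-1)^{i+2} A_{i,2} \det A_{\backslash i,2} + \cdots + (-1)^{i+\ell} A_{i,\ell} \det A_{\backslash i,\ell} \\
&= (-1)^{i+k} A_{i,k} \det A_{\backslash i,k} + \sum_{j \neq k} (-1)^{i+j} A_{i,j} \det A_{\backslash i,j} \\
&= (-1)^{i+k} c B_{i,k} \det B_{\backslash i,k} + \sum_{j \neq k} (-1)^{i+j} B_{i,j} \big( c \det B_{\backslash i,j} \big) \\
&= c \Big[ (-1)^{i+k} B_{i,k} \det B_{\backslash i,k} + \sum_{j \neq k} (-1)^{i+j} B_{i,j} \det B_{\backslash i,j} \Big] \\
&= c \sum_{j=1}^{\ell} (-1)^{i+j} B_{i,j} \det B_{\backslash i,j} = c \det B.
\end{aligned}
\]

Hence the determinant may be calculated by expansion along any row.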


As for expansion along any column, we begin by showing that the function $f(M) = \det M^T$ (where det M is
defined by row expansion) has all four properties for any size matrix, so must equal det M. Note that if M is a 1 × 1
matrix, $M^T = M$. Therefore $f(M) = \det M^T = \det M = M_{1,1}$, so f(M) has all four properties. Observe that if M is an
n × n matrix and n > 1, then by definition of f and the row expansion formula for determinant,

\[
f(M) = \sum_{j=1}^{n} (-1)^{i+j} M_{j,i} \det (M^T)_{\backslash i,j} \tag{3.5.2}
\]

for any fixed i = 1, 2, …, n. We now proceed to show that f has all four properties for n × n matrices where n > 1.

1. For any n, $I_{n \times n}^T = I_{n \times n}$, so $f(I_{n \times n}) = \det(I_{n \times n}^T) = \det(I_{n \times n}) = 1$.


2. Suppose M is a 2 × 2 matrix with two identical columns. Then $M = \begin{bmatrix} a & a \\ b & b \end{bmatrix}$ for some constants a, b, and

\[
f(M) = \det M^T = \det \begin{bmatrix} a & b \\ a & b \end{bmatrix} = ab - ba = 0.
\]

Now set n > 2, suppose f(A) = 0 for any (n − 1) × (n − 1) matrix A with two identical columns, and let M be
an n × n matrix with identical columns k and ℓ. Because M has at least three columns, there is an i, 1 ≤ i ≤ n,
such that i ≠ k and i ≠ ℓ. Then, for this particular i,

\[
\begin{aligned}
f(M) &= \sum_{j=1}^{n} (-1)^{i+j} M_{j,i} \det (M^T)_{\backslash i,j} \\
&= \sum_{j=1}^{n} (-1)^{i+j} M_{j,i} \det (M_{\backslash j,i})^T \\
&= \sum_{j=1}^{n} (-1)^{i+j} M_{j,i} f(M_{\backslash j,i}).
\end{aligned}
\]

By the inductive hypothesis, $f(M_{\backslash j,i}) = 0$ for all j, so f(M) = 0.


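3. Let n > 1 and suppose A, B, and C are identical n × n matrices except in their kth columns where $C_{:k} = A_{:k} + B_{:k}$.
Then $(C^T)_{\backslash k,j} = (A^T)_{\backslash k,j} = (B^T)_{\backslash k,j}$ for all j = 1, …, n because $A^T$, $B^T$, and $C^T$ are all identical except in their
kth rows. Now, applying (3.5.2) with i = k,

\[
\begin{aligned}
f(C) &= \sum_{j=1}^{n} (-1)^{k+j} C_{j,k} \det (C^T)_{\backslash k,j} \\
&= \sum_{j=1}^{n} (-1)^{k+j} (A_{j,k} + B_{j,k}) \det (C^T)_{\backslash k,j} \\
&= \sum_{j=1}^{n} (-1)^{k+j} \big( A_{j,k} \det (C^T)_{\backslash k,j} + B_{j,k} \det (C^T)_{\backslash k,j} \big) \\
&= \sum_{j=1}^{n} (-1)^{k+j} \big( A_{j,k} \det (A^T)_{\backslash k,j} + B_{j,k} \det (B^T)_{\backslash k,j} \big) \\
&= \sum_{j=1}^{n} (-1)^{k+j} A_{j,k} \det (A^T)_{\backslash k,j} + \sum_{j=1}^{n} (-1)^{k+j} B_{j,k} \det (B^T)_{\backslash k,j} \\
&= f(A) + f(B).
\end{aligned}
\]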
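4. Let n > 1 and suppose A and B are identical n × n matrices except in their kth columns where $A_{:k} = cB_{:k}$.
Observe that $(A^T)_{\backslash k,j} = (B^T)_{\backslash k,j}$ for all j = 1, …, n because $A^T$ and $B^T$ are identical except in their kth rows.
Now applying (3.5.2) with i = k,

\[
\begin{aligned}
f(A) &= \sum_{j=1}^{n} (-1)^{k+j} A_{j,k} \det (A^T)_{\backslash k,j} \\
&= \sum_{j=1}^{n} (-1)^{k+j} c B_{j,k} \det (B^T)_{\backslash k,j} \\
&= c \sum_{j=1}^{n} (-1)^{k+j} B_{j,k} \det (B^T)_{\backslash k,j} \\
&= c f(B).
\end{aligned}
\]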
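Finally, the expression

\[
(-1)^{1+j} M_{1,j} \det M_{\backslash 1,j} + (-1)^{2+j} M_{2,j} \det M_{\backslash 2,j} + \cdots + (-1)^{n+j} M_{n,j} \det M_{\backslash n,j}
\]

from (3.5.1) equals

\[
\begin{aligned}
\sum_{i=1}^{n} (-1)^{i+j} M_{i,j} \det M_{\backslash i,j}
&= \sum_{i=1}^{n} (-1)^{i+j} (M^T)_{j,i} \det \big( (M^T)_{\backslash j,i} \big)^T \\
&= \sum_{i=1}^{n} (-1)^{j+i} (M^T)_{j,i} \det (M^T)_{\backslash j,i} \\
&= \det M^T \\
&= \det M
\end{aligned}
\]

by the row expansion formula and the fact that $\det M^T = \det M$.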


This proof is an adaptation of the presentation in sections 6.1 and 6.2 of [17]. 


To recap,
• the determinant of a swap matrix is −1,
• the determinant of a scale matrix is the scale factor, and
• the determinant of a replace matrix is 1.

Interestingly, within the proof of (3.5.1) lie the proofs that
• if B is the result of swapping two columns of a square matrix A, then det B = −det A,
• if B is the result of scaling a column of a square matrix A by c, then det B = c det A, and
• for any square matrix M, $\det M^T = \det M$.

Putting these three facts together, it is easy to justify the following two facts. 


• If B is the result of swapping two rows of a square matrix A, then det B = −det A.


[$\det B = \det B^T = -\det A^T = -\det A$ since $B^T$ is the result of swapping two columns of $A^T$ and $\det M^T = \det M$.]


• If B is the result of scaling a row of a square matrix A by c, then det B = c det A.


Can you justify this? Answer on page 105. 


The relevance of all these observations is mounting evidence that det(EA) = det E · det A for any square matrix A
and elementary matrix E. This will be an important point soon enough. We already have that det(EA) = −det A =
det E · det A when E is a swap matrix and det(EA) = c det A = det E · det A when E is a scale matrix. We are only
missing this fact for elementary replacement matrices.

The proof of (3.5.1) does not provide direct proof that if B is the result of a row replacement in a square matrix A, 
then det B = det A, but it provides the right tools for the job. Beside the facts already noted, we learn from the proof 
that 


• if A, B, and C are identical square matrices except in one column, say the kth, where $C_{:k} = A_{:k} + B_{:k}$, then
det(C) = det A + det B, and

• if C is a square matrix with two identical columns, then det C = 0.


As we just encountered, statements about the columns of a matrix and its determinant can generally be restated in
terms of rows since $\det M^T = \det M$. It is safe to conclude that


• if A, B, and C are identical square matrices except in one row, say the kth, where $C_{k:} = A_{k:} + B_{k:}$, then
det(C) = det A + det B, and

• if C is a square matrix with two identical rows, then det C = 0.
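Justifications are requested in exercises 15 and 16. These two facts plus the fact that if B is the result of scaling a row
of a square matrix A by c, then det B = c det A make it a straightforward matter to prove that if B is the result of a row
replacement in a square matrix A, then det B = det A.

Proof. Let A be an n × n matrix and suppose B is the result of adding c times the jth row to the kth row of A, j ≠ k.
Then

\[
\det B =
\begin{vmatrix} A_{1:} \\ \vdots \\ A_{k-1:} \\ A_{k:} + cA_{j:} \\ A_{k+1:} \\ \vdots \\ A_{n:} \end{vmatrix}
=
\begin{vmatrix} A_{1:} \\ \vdots \\ A_{k-1:} \\ A_{k:} \\ A_{k+1:} \\ \vdots \\ A_{n:} \end{vmatrix}
+
\begin{vmatrix} A_{1:} \\ \vdots \\ A_{k-1:} \\ cA_{j:} \\ A_{k+1:} \\ \vdots \\ A_{n:} \end{vmatrix}
= \det A + c
\begin{vmatrix} A_{1:} \\ \vdots \\ A_{k-1:} \\ A_{j:} \\ A_{k+1:} \\ \vdots \\ A_{n:} \end{vmatrix}
= \det A + 0 = \det A,
\]

where the last determinant is zero because rows j and k of that matrix are identical.

Therefore, det(EA) = 1 · det A = det E · det A when E is a replacement matrix.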
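A numerical spot check of det(EA) = det E · det A for one matrix of each elementary type might look like the following sketch. The constructors `swap`, `scale`, and `replace` are hypothetical helper names chosen for this illustration, and A is the original matrix from the row replacement puzzle earlier in the section.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]

def swap(n, i, j):
    """Elementary matrix that swaps rows i and j (0-indexed)."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale(n, i, c):
    """Elementary matrix that scales row i by c."""
    E = identity(n)
    E[i][i] = c
    return E

def replace(n, k, j, c):
    """Elementary matrix that replaces row k by row k plus c times row j."""
    E = identity(n)
    E[k][j] = c
    return E

A = [[2, -15, -4], [3, 5, -8], [3, 8, 1]]
elementary = {
    "swap": swap(3, 0, 1),
    "scale": scale(3, 2, 2),
    "replace": replace(3, 0, 2, 3),
}

for name, E in elementary.items():
    # det(EA) should equal det(E) * det(A) for every elementary matrix E.
    assert det(matmul(E, A)) == det(E) * det(A), name
    print(name, det(E), det(matmul(E, A)))
```

Multiplying by the replacement matrix reproduces the first row 11, 9, −1 of the puzzle matrix, which is a handy sanity check on the constructor.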


Key Concepts 
elementary matrices are invertible and det(EA) = det E · det A for any square matrix A and elementary matrix E.


determinant by expansion the determinant of an n × n matrix M may be calculated by expansion along any row or
any column:

\[
\begin{aligned}
\det M &= (-1)^{i+1} M_{i,1} \det M_{\backslash i,1} + (-1)^{i+2} M_{i,2} \det M_{\backslash i,2} + \cdots + (-1)^{i+n} M_{i,n} \det M_{\backslash i,n} \\
&= (-1)^{1+j} M_{1,j} \det M_{\backslash 1,j} + (-1)^{2+j} M_{2,j} \det M_{\backslash 2,j} + \cdots + (-1)^{n+j} M_{n,j} \det M_{\backslash n,j}
\end{aligned}
\]

for any i = 1, …, n or any j = 1, …, n.


determinant of replacement matrix if E is an elementary replacement matrix, det E = 1.
determinant of swap matrix if E is an elementary swap matrix, det E = −1.
determinant of scale matrix if E is an elementary scale matrix, det E = s where s is the scale factor.


determinant of the transpose for any square matrix A, $\det A^T = \det A$.


Exercises 2. Find the determinant of the triangular matrix. 
1. Take advantage of the fact that the determinant may be 6 0 0 
expanded along any row or any column to compute the (a)} -1 -1 0 
determinant. -l 3 2 
0 -2 0 9 : : 7 
-4 0 1 0 (b) = 
(a) 4 90 2 [S]-300 00 5 
0 O00 0 4 2 0 0 0 
0 -5 0 -2 @|— 8 © © | ts1-300 
07 0 0 1 8 4 0 
(b) 4 30 0 —2 —2 8 «7 
2 0 2 -3 2 7 7 6 
0 1 3 =-1 
03 03 i ae 
0 0 O -!1 
CEs a6 a3 
0-9 O 0 3. Use the fact that 
0 9 3 -2 pee 
8 -2 2 /|=32 
(d) 0 -2 0 O 8 0 8 
-6 -4 0 O 
0 4 2 0 to compute the determinant of [S]-300
0 1 0;f 2 0 0 1 6 5
(ay | 1 0 0 8 2 2 (a)| 1 -1 8 
00 1}, 8 0 8 5 
10 0 2 0 0 7 0 5 
(vb) | 0 1 O 8 -2 2 (bo) | 1 -1 8 
00 4 8 0 8 1 6 5 
1 0 0 —2 0 0 1 6 5 
(c)} 0 1 - 8 -2 2 (c)| 1 -1 8 
00 1 8 0 8 9 -2 21 
3 0 0 1 0 0 —2 O 0 1 -1 8 
(d)| 0 1 0 0 1.0 8 -2 2 Gr a aS 
001 18 0 1 8 0 8 7 0 5 
4. Use the fact that 7. Use the fact that 
5 3 7 6 -2 1 
“1 4 3/=14 6 5 -2|=455 


to compute the determinant of to compute the determinant of 


1 0 0 5 3 7 . 4 
(ay/0 0 14) -1 4 3 @|—2 5 7 
0 1 0 2 6 8 ; oe 
wfo3 spi 23 sr 
(b) 7 (b) | -2 7 5 
00 1 2 6 8 ha 
1 0 0 5 3 7 
6 6 7 1 0 0 
oe = off : ‘| @:| 62 5 7] e 1 o 
a ee 1 —-2 8|f[0o0 1 
00 1 1 -2 0 5 3 7 1 
6 6 7 x 00 
ofereleretasi] o/s 5 ates 
1 2 8 0 01 
USE Hie itac eat 8. Use the facts that 
4 4 -1 
1 7 3 |=38 det (E3E EA) = | 
-1 3 4 
and 


to compute the determinant of [A]-351 (a) E, is a swap matrix 


4 4 -1 (b) E> is ascale matrix with scale factor —2. 
(a) | -1 3 4 ; : 
1 7 3 (c) E3 1s areplacement matrix. 
4 4 -] to determine det A. [A]-351 
(b) | 1 7 3 | 9. Use the facts that 
7 ; 5 det (E,E3E,E,A) = 1 
4 4 -] 4h3 koh) 
(c) 0 10 7 and 
-1 3 #4 ; eli 
(a) E, is ascale matrix with scale factor 2. 
5 35 15 ; se 
(d) i 4 -] (b) E> is a scale matrix with scale factor 3. 
-1 3 4 (c) E3 1s areplacement matrix. 
6. Use the fact that (d) E4 is a scale matrix with scale factor x. 
1 6 5 to determine det A. 
1 -1 8 |= 336 10. Let A be a4 x 4 matrix with det A = 3. Find det(2A). 
7 0 5 
to compute the determinant of

11. Let M be an n × n matrix with det M = 4. Find det(7M). [A]-351

12. Let A be a square matrix and E a scale matrix with scale factor z. Find $\det(E^3 A)$. That is, det(EEEA).

13. Suppose A is a square matrix, E is a swap matrix, and det(EA) = 33. Find
(a) det A
(b) det E
(c) $\det A^T$

14. Suppose M is a square matrix, E is a replacement matrix, and det(EM) = −5. Find $\det(M^T)$. [A]-351

15. Use the fact that if A, B, and C are identical square matrices except in one column, the kth, where $C_{:k} = A_{:k} + B_{:k}$,
then det(C) = det A + det B to prove that if A, B, and C are identical square matrices except in one row, say the
kth, where $C_{k:} = A_{k:} + B_{k:}$, then det(C) = det A + det B.

16. Use the fact that if C is a square matrix with two identical columns, then det C = 0 to prove that if C is a square
matrix with two identical rows, then det C = 0.

17. Prove that the determinant of a square upper triangular matrix is the product of the entries on its diagonal by
expanding along the first column.
Answers 


what is my matrix (swap)? The original matrix can be recovered by swapping the first two rows back:

\[
\begin{bmatrix} 13 & -11 & -19 \\ 23 & 12 & 22 \\ -16 & 19 & -5 \end{bmatrix}
\]


what is my matrix (scale)? The original matrix can be recovered by scaling the third row by 1/2 (the multiplicative
inverse of 2):

\[
\begin{bmatrix} 2 & -13 & -6 \\ -7 & -9 & 21 \\ -7 & -9 & 12 \end{bmatrix}
\]


what is my matrix (replace)? The original matrix can be recovered by replacing the first row by the first row plus
negative three (the additive inverse of 3) times the third:

\[
\begin{bmatrix} 2 & -15 & -4 \\ 3 & 5 & -8 \\ 3 & 8 & 1 \end{bmatrix}
\]


determinant of a scale matrix A scale matrix is lower triangular with ones on the diagonal everywhere except the
row that it scales, where the entry equals the scale factor. Since its determinant is the product of the entries on
its diagonal, the determinant equals the scale factor.


determinant of a replace matrix A replace matrix is either lower triangular or upper triangular with ones on the
main diagonal. Therefore its determinant is one. NOTE: This argument uses the fact in exercise 19 of section
3.4.


determinant of a scaled matrix $\det B = \det B^T = c \det A^T = c \det A$ since $B^T$ is the result of scaling a column of $A^T$
and $\det M^T = \det M$.

3.6 Characterization of Square Matrices


For square matrices, all fourteen of the statements in theorems 5 and 6 are equivalent. A square matrix with a pivot 
in every row has a pivot in every column, and vice versa—end of justification. Square matrices have an additional 
property to discuss, though—invertibility. It turns out that, for a square matrix, the conditions in theorems 5 and 6 
plus a couple that don’t appear in those theorems are equivalent to invertibility. Consider the following. 


1. M has a pivot position in every row and column.

2. det M ≠ 0.

3. M can be row reduced to I.

4. M is invertible.


You may or may not have considered these statements equivalent up to this point, and there is no harm done either
way. It turns out they are equivalent to one another and equivalent to the statements in theorems 5 and 6. All
this will be summarized in one last matrix characterization theorem, justified by the following narrative that shows
(1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (thm 5) and (thm 6) ⇒ (1). Until the statement of the theorem, where this information
will be repeated, assume that M is an n × n matrix.

Suppose M has a pivot position in every row and every column. Record the elementary row operations, and
more importantly the corresponding elementary matrices, $E_1, E_2, \ldots, E_k$, that reduce M to any row echelon
form R. Then $E_k \cdots E_2 E_1 M = R$ where R is in row echelon form. Because all elementary matrices are invertible,
$M = E_1^{-1} E_2^{-1} \cdots E_k^{-1} R$ and therefore $\det M = \det(E_1^{-1} E_2^{-1} \cdots E_k^{-1} R) = \det E_1^{-1} \cdot \det E_2^{-1} \cdots \det E_k^{-1} \cdot \det R$ (a result of
section 3.5). Because the inverse of an elementary matrix is an elementary matrix itself and all elementary matrices
have nonzero determinant, all the $\det E_i^{-1}$ are nonzero. Because M has a pivot position in every row, R must be
upper triangular with nonzero entries (the pivots) on the diagonal, making det R equal to the product of these nonzero
entries. Hence det R ≠ 0 and it follows that det M ≠ 0.

Suppose det M ≠ 0. The reduced row echelon form, R, can be represented by $R = E_k \cdots E_2 E_1 M$ for some
elementary matrices $E_1, E_2, \ldots, E_k$. Because $\det R = \det(E_k \cdots E_2 E_1 M) = \det E_k \cdots \det E_2 \cdot \det E_1 \cdot \det M$ and
det M ≠ 0, R must also have nonzero determinant. But the only reduced row echelon form of a square matrix with
nonzero determinant is the identity (all others have a row of zeros, putting a zero on the main diagonal). Therefore M
can be reduced to I.

Supposing M can be reduced to I, we have $E_k \cdots E_2 E_1 M = I$ for some elementary matrices $E_1, E_2, \ldots, E_k$.
Letting $E = E_k \cdots E_2 E_1$, we have EM = I. But elementary matrices are invertible, so E is invertible and therefore
$M = E^{-1}$. Since $E^{-1}$ is invertible (with inverse E), M is invertible.

Supposing M is invertible, let $L = R = M^{-1}$, proving the existence of matrices L and R such that LM = I = MR.
By theorems 5 and 6, M has a pivot position in every row and column.

Supposing there is a matrix R such that MR = I, v = Rb is a solution of Mv = b since M(Rb) = (MR)b = Ib = b.
Hence Mv = b has at least one solution for each b, and by theorem 6 M has a pivot position in each row. Since M is
square, M has a pivot position in each column as well.


Crumpet 21: Proving Real Numbers are Equal 


Every so often, it is convenient to prove that two real numbers, x and y, are equal by showing both x ≤ y and x ≥ y.
The only way x can be less than or equal to y and simultaneously greater than or equal to y is for x to equal y. This 
technique is implicitly used to justify part (ix) of theorem 7. Theorem 5 implies Mv = b has at most one solution 
(the number of solutions is less than or equal to one) and theorem 6 implies Mv = b has at least one solution (the 
number of solutions is greater than or equal to one). Together, then Mv = b has exactly one solution. 


We now have justification for the following theorem. 


Theorem 7. [Invertible Matrix Theorem] Suppose M is an n × n matrix, and b and v have n entries. Then the
following are equivalent.
(i) The columns of M are linearly independent.
(ii) The rows of M are linearly independent.
(iii) No column of M is a linear combination of the others.
(iv) No row of M is a linear combination of the others.
(v) Mv = 0 has only the trivial solution.
(vi) M has a pivot position in every column.
(vii) M has a pivot position in every row.
(viii) Mv = b has no free variables.
(ix) Mv = b has exactly one solution for every b.
(x) M can be row reduced to I.
(xi) There is a matrix L such that LM = I.
(xii) There is a matrix R such that MR = I.
(xiii) det M ≠ 0.
(xiv) M is invertible.


This theorem gives 13 ways to detect whether a square matrix is invertible, impressive in itself. But we can also
draw two separate, significant conclusions from all this. Parts (xi) and (xii) suggest we only need to check that AB = I
or BA = I, not both as required by the definition, to conclude that B is the inverse of A. The theorem gives the other
equality. Additionally, the bolded section of the justification, near the middle of page 106, provides an algorithm for
calculating the determinant of a square matrix! Can you follow the instructions to compute the determinant of

\[
\begin{bmatrix} 6 & 3 & 6 \\ -2 & 1 & -1 \\ 3 & 4 & 6 \end{bmatrix}?
\]

Answer on page 109.

If you concluded in exercise 11 of section 1.5 that one row of a matrix could only be written as a linear combination
of the others when the determinant of the matrix was zero, you were correct, and we finally have the theory to support
it.

Key Concepts 


characterization of invertible matrices see theorem 7. 


algorithm for computing the determinant reduce the matrix to row echelon form, noting the row operations used. 
The product of the determinants of the inverses of the associated elementary matrices with the determinant of 
the reduced matrix is the desired determinant. 
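The algorithm above can be sketched in Python as follows. This is one possible rendering under simplifying assumptions, not the book's prescribed procedure: it uses only row swaps and row replacements (no scaling), so the only inverse elementary matrices with determinant different from 1 are the swaps, each contributing a factor of −1. The function name `det_by_row_reduction` is an illustrative choice.

```python
from fractions import Fraction

def det_by_row_reduction(M):
    """Reduce M to row echelon form with swaps and replacements, tracking the
    product of the determinants of the inverse elementary matrices."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    factor = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column, so the determinant is 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            factor *= -1  # a swap matrix is its own inverse and has determinant -1
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            # Row replacement: the inverse elementary matrix has determinant 1.
            A[r] = [a - m * b for a, b in zip(A[r], A[col])]
    diagonal_product = Fraction(1)
    for i in range(n):
        diagonal_product *= A[i][i]  # determinant of the triangular reduced matrix
    return factor * diagonal_product

example = [[-5, 4, -1], [2, -2, 3], [5, -2, -3]]
print(det_by_row_reduction(example))  # 18, matching the expansions in section 3.5
```

Exact fractions keep the arithmetic free of rounding error; a floating point version would need a tolerance when testing for zero pivots.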


Exercises

1. The row operations that reduce a matrix A to

\[
\begin{bmatrix} -5 & 15 & -10 \\ 0 & 12 & -14 \\ 0 & 0 & -2 \end{bmatrix}
\]

are given. Find det A.

(a) Ten row replacements.
(b) Five row replacements and three row swaps.
(c) Nine row replacements, a row scale by 6, and a row scale by 5.
(d) Four row replacements, a row scale by 10, and two row swaps.

2. The row operations that reduce a matrix A to

\[
\begin{bmatrix} -2 & 0 & 0 & 0 \\ -11 & -1 & 0 & 0 \\ 13 & 10 & 5 & 0 \\ 0 & 2 & 14 & 6 \end{bmatrix}
\]

are given. Find the possible values of det A.

(a) Row replacements only.
(b) Row replacements and row swaps only. [S]-301
(c) Row replacements and row scales by 3, 10, and 14.
(d) Row replacements, row swaps, and row scales.
3. The row operations that reduce a matrix A to

\[
\begin{bmatrix} -21 & -2 & 6 \\ 0 & 0 & -5 \\ 0 & 0 & -9 \end{bmatrix}
\]

are given. Find det A.

(a) Five row replacements and three row swaps.
(b) Four row replacements, two row swaps, and a row scaling by −5.
(c) 36 row replacements, 13 row swaps, and scaling by 12, −13, and −3.

4. Row reduce to a triangular matrix to compute the determinant.


3 10 
@) u-| 15-10 
-12 12 
(b) u =| a | [S]-301 
oF 
(©) u=| 4.7 | 
16 =3 <2 
(4) M=-| -8 4 -2 
8 1 2 
-10 12 -50 
(ec) M=| 20 -18 80 
-30 18 -80 
11 =15 4 
(f) M=| 8 9 -4 | [S}301 
= a ee 
3 90 -308 -6 
-3 -140 484 10 
GM=| 6 a0. <ar ae. | ere 
& 90 203) Jd 
-80 -161 -18 55 
80 154 27 -66 
io oad ane 0 9 -11 
294 =<49 (a7 3=99 


5. Compute the determinant using a judicious combination of row expansion, column expansion, and row reduction.


=21 9 0 
| 9 | 
4 1 


-l 5 -l 
(b) | 0 6 -l | [S]-302 
1 


4 -10 2 
~ or a 

()}| -9 1 -1 
“92 4 <3 


=o. 24. 2 3 

oe a ee: 
Mis 40 -5 -113 

0 2 14 6 
-87 —25 
aa oa ou 
ce 
i =f m 3 


6. Is the matrix invertible? Explain.


@| > i | 


a 9 
(b) 0 ~20 | 
1 ot 
() -5 -4 | 
i @ 3 
(d) | 03 1 | 
0 9 +14 
1 1 3 
(e) | 47 i | [S}-302 
0 3 20 
4 4 2 
() | -7 -16 “| 
8 5 -3 


7. Suppose M is not invertible yet there is a matrix R such that MR = I. How is this possible?


. Suppose M is square and 3M.» = 2M.) — 8M.5 + 5M.o. 


What is det M? 


9. Suppose the rows of M are linearly independent but M is not invertible. How can this be?

10. Explain why a matrix with a pivot position in every row and every column must be invertible.

11. Suppose G is square and Gv = b is inconsistent for some vector b. What can you say about solutions of
Gv = 0? [A]-351

12. If G is square and Gv = 0 has infinitely many solutions, what can you say about solutions of Gv = b?

13. If M is invertible, then the rows of $M^T$ are linearly independent. Explain why. [A]-351


14. If H is 7 × 7 and Hx = b is consistent for every b, how many pivot positions does H have?

15. If a square matrix B cannot be reduced to the identity matrix, what can you say about [A]-351

(a) its columns?
(b) the equation Bv = 0?
(c) the equation AB = I?

16. Describe the row echelon form of an invertible matrix.


17. When the determinant of an n × n matrix is zero, (select all that apply) [A]-351

(a) exactly one row is a linear combination of the others.
(b) every row is a linear combination of the others.
(c) each row after the first one is a linear combination of the rows above it.
(d) any linear combination of the n rows sums to zero.
(e) at least one row is a linear combination of the others.
(f) its inverse is the zero matrix.
(g) it has no inverse.

18. Recall that λ, v is an eigenpair for M whenever v ≠ 0 yet (M − λI)v = 0. Use theorem 7 to prove that the following
statements are equivalent.

(a) λ is an eigenvalue of M.
(b) The rows of M − λI are linearly dependent.
(c) det(M − λI) = 0.


Answers

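determinant The instructions are, in brief: Record the elementary row operations, and more importantly [note]
the corresponding elementary matrices, $E_1, E_2, \ldots, E_k$, that reduce M to any row echelon form. Then
$\det M = \det E_1^{-1} \cdot \det E_2^{-1} \cdots \det E_k^{-1} \cdot \det R$.

Recording the elementary row operations during row reduction:

\[
\begin{bmatrix} 6 & 3 & 6 \\ -2 & 1 & -1 \\ 3 & 4 & 6 \end{bmatrix}
\xrightarrow[M_{3:} \to -2M_{3:}]{M_{2:} \to 3M_{2:}}
\begin{bmatrix} 6 & 3 & 6 \\ -6 & 3 & -3 \\ -6 & -8 & -12 \end{bmatrix}
\xrightarrow[M_{3:} \to M_{3:} + M_{1:}]{M_{2:} \to M_{2:} + M_{1:}}
\begin{bmatrix} 6 & 3 & 6 \\ 0 & 6 & 3 \\ 0 & -5 & -6 \end{bmatrix}
\]

\[
\xrightarrow{M_{2:} \to M_{2:} + M_{3:}}
\begin{bmatrix} 6 & 3 & 6 \\ 0 & 1 & -3 \\ 0 & -5 & -6 \end{bmatrix}
\xrightarrow{M_{3:} \to M_{3:} + 5M_{2:}}
\begin{bmatrix} 6 & 3 & 6 \\ 0 & 1 & -3 \\ 0 & 0 & -21 \end{bmatrix}
\]

The determinant of $\begin{bmatrix} 6 & 3 & 6 \\ -2 & 1 & -1 \\ 3 & 4 & 6 \end{bmatrix}$ is the determinant of $\begin{bmatrix} 6 & 3 & 6 \\ 0 & 1 & -3 \\ 0 & 0 & -21 \end{bmatrix}$, which is 6 · 1 · (−21) = −126, times
the determinants of the inverse elementary matrices:

\[
\begin{vmatrix} 6 & 3 & 6 \\ -2 & 1 & -1 \\ 3 & 4 & 6 \end{vmatrix}
= (-126)(1)(1)(1)(1)\left(-\tfrac{1}{2}\right)\left(\tfrac{1}{3}\right) = 21.
\]

3.7 The Inverse Revisited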


As if we haven’t already extracted enough information from theorem 7, we also have the rather significant following 
theorem as a consequence. 


Theorem 8. [Determinant of a Product] If A and B are n × n matrices, then det(AB) = det A · det B.


Proof. First suppose AB is noninvertible. By theorem 7, det(AB) = 0. If both A and B are invertible, then
$(AB)(B^{-1}A^{-1}) = I$, so AB is invertible. Therefore we must have that either A or B is noninvertible, from which it
follows det A = 0 or det B = 0. Either way, det A · det B = 0 and we have shown det(AB) = det A · det B. Now suppose
AB is invertible, and let $M = (AB)^{-1}$. Then I = (AB)M = A(BM), so $A^{-1} = BM$ and A is invertible. As in the
justification of (3) ⇒ (4) on page 106, we may therefore write A as a product of elementary matrices, $A = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$.
Hence $\det(AB) = \det(E_1^{-1} E_2^{-1} \cdots E_k^{-1} B) = (\det E_1^{-1} \cdot \det E_2^{-1} \cdots \det E_k^{-1}) \det B = \det A \det B$. □


As a direct consequence, we can relate the determinants of inverse matrices. If M is invertible, then $\det M \cdot \det M^{-1} =
\det I = 1$ and therefore $\det M^{-1} = \frac{1}{\det M}$.
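Both facts are easy to spot check. The sketch below reuses two matrices from earlier in this chapter (with determinants 21 and 18) for the product rule, and a small 2 × 2 matrix chosen here purely for illustration, together with its inverse, for the reciprocal rule. The helper names are assumptions of this example.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

A = [[6, 3, 6], [-2, 1, -1], [3, 4, 6]]      # det A = 21
B = [[-5, 4, -1], [2, -2, 3], [5, -2, -3]]   # det B = 18

# Theorem 8: the determinant of a product is the product of the determinants.
assert det(matmul(A, B)) == det(A) * det(B)

# Illustrative 2 x 2 matrix (not from the text) and its inverse.
M = [[4, 7], [2, 6]]
M_inv = [[Fraction(6, 10), Fraction(-7, 10)], [Fraction(-2, 10), Fraction(4, 10)]]
assert matmul(M, M_inv) == [[1, 0], [0, 1]]

# The determinant of the inverse is the reciprocal of the determinant.
assert det(M_inv) == Fraction(1, det(M))
print(det(matmul(A, B)), det(M), det(M_inv))
```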


There is more! The proof of theorem 7 also provides an algorithm for finding the inverse of a matrix. Given that
M is invertible, it is reducible to the identity matrix, meaning there are elementary matrices $E_1, E_2, \ldots, E_k$ such that
$E_k \cdots E_2 E_1 M = I$. Therefore $M^{-1} = E_k \cdots E_2 E_1 I$, so the same sequence of elementary row operations that reduces
M to the identity also transforms I into $M^{-1}$! Hence, if we augment M with the identity matrix and reduce to reduced
row echelon form, the augmented columns will hold $M^{-1}$. To illustrate, let

\[
M = \begin{bmatrix} 3 & 0 & 5 & 0 \\ 5 & 1 & 0 & 2 \\ 6 & 2 & 0 & 7 \\ 0 & 0 & -1 & -2 \end{bmatrix}.
\]
Augmenting the identity and reducing, 
3 0 5 0 10 0 0 62 0 7 00 1 0 
5 1 0 2 01 0 O0}]Mce%:;1 5 1 0 2 0 1 0 0 My: >Mi.—Mo, 
62 0 7 00 1 0 3 0 5 0 1 0 0 0 
0 0 -l —2 00 0 1 0 0 -l -2 00 0 1 
11 0 5 0 -1 1 0 1 1 O 5 0 -!l 1 O 
5 1 0 2 0 1 =O O | %:>m,:-5m;} 0 -4 O -23 0 6 -5 O Ma, Ma. M3; 
30 5 0 1 0 0 0 | m.>ms,-3m,] 0 -3 5 -15 1 3 -3 0 
0 0 -l —2 0 0 0 1 0 O -l -2 0 0 0 1 
1 1 0O 5 0 -!l 1 O 1 1 O 5 0 -1 1 O 
0 -l -5 -8 -1 3 -2 0 M2, 9-1 Ma; 0 1 5 8 1-3 2 O M3, Ma;+3Mp, 
0 -3 5 -15 1 3 -3 0 0 -3 5 -15 1 3 -3 0 
0 0 -l —2 0 0 0 1 0 O -l -2 0 0 0 1 
11 0 5 0 -1 1 0 1 1 O 5 0 -1 1 O 
0 1 5 8 1 -3 2 O | M3:2M3:+20M4;] 0 1 5 8 1-3 2 0 MaMa; 
0 0 20 9 4 -6 3 0 0 0 0 -31 4 -6 3 20 
0 0 -l -2 0 0 0 1 0 0 -l -2 0 0 0 1 
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1 1 O 5 0 -1 1 O 1 1 0 °5 QO -l 1 0 
0 1 5 8 1 —3 2 O Mai 01 5 8 1 -3 2 0 Ma, M2. 5Ma,; 
0 0-1 —-2 0 0 0 1 Ms, =! Mg 0012 0 0 0 -1 M,;->M\,-Mp, 
ae 4 6 3 20 
0 0 O -31 4 -6 3 20 00 0 1 3, St 3] ai 
100 7 -1 2 -l -5 100 7 -1 2, -1 -5 
0 1 0 -2 1 -3 2 5 M3,.>M3,-2M4.| 0 1 O —-2 1 -3 2 5 
2 6 9 
oe 7 : : : ae: 31, “i 3, iho 
000 1 ~3r 31 313 000 1 ~ 31 31 ~3r 31 
io og os 2 2 2s 
M>:>M>:+2M4.1 0 1 0 O on ‘ 56. 113 
amie ee: 31 31 31 31 
M,7M,,-7™My,} 0 O 1 O i ~ x * 3 
4 6 3 20 
000 1 ~ 3] 31 37 3] 


so 
-3 20 -10 -15 
1} 23 -81 56 115 
~31/ 8 -12 6 9 
-4 6 -3 -20 
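
The same augment-and-reduce procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than code from the text: the function names rref and inverse are hypothetical, exact fractions stand in for hand arithmetic, and the pivoting order differs from the hand computation above (the reduced row echelon form is the same regardless).

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a matrix given as a list of rows."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]        # row swap
        scale = A[pivot_row][col]
        A[pivot_row] = [x / scale for x in A[pivot_row]]       # scale pivot to 1
        for r in range(n_rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]                             # clear the column
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

def inverse(M):
    """Row reduce [M | I]; if the left block becomes I, the right block is the inverse."""
    n = len(M)
    augmented = [list(row) + [1 if i == j else 0 for j in range(n)]
                 for i, row in enumerate(M)]
    R = rref(augmented)
    if any(R[i][j] != (1 if i == j else 0) for i in range(n) for j in range(n)):
        raise ValueError("matrix is not invertible")
    return [row[n:] for row in R]

M = [[3, 0, 5, 0], [5, 1, 0, 2], [6, 2, 0, 7], [0, 0, -1, -2]]
M_inv = inverse(M)
# Multiply through by 31 to compare with the matrix displayed above.
scaled = [[int(31 * x) for x in row] for row in M_inv]
print(scaled)
# [[-3, 20, -10, -15], [23, -81, 56, 115], [8, -12, 6, 9], [-4, 6, -3, -20]]
```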


Crumpet 22: Inverses via Row Reduction 


We could have seen that inverses could be computed with the help of row reduction long ago. After all, if A is an
n × n matrix and B is its inverse, then AB = I. By thinking of this product one column at a time, this means


AB_{:1} = I_{:1}, AB_{:2} = I_{:2}, ..., AB_{:n} = I_{:n}.


Solving these equations for the B_{:j} could be done one at a time by row reduction. Putting the solutions together into
a matrix would give B. Reducing A n times would be repetitive and time consuming, though. Better, the solutions
could be found simultaneously by augmenting all of the I_{:j} together (in effect, augmenting the identity matrix) and
reducing once (the algorithm presented in this section).


While this process is still tedious for large matrices, it certainly beats the alternative of using formula (1.6.2). 
Ironically the ideas presented recently give us the tools to finally prove that (1.6.2) correctly computes the inverse. 


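Let M be an n × n matrix and consider modifying M by replacing row j with a copy of row i, i ≠ j. Call the modified
matrix \tilde{M}, and write M_{\backslash j,k} for M with row j and column k removed. Then

\[ |\tilde{M}| = (-1)^{j+1} \tilde{M}_{j,1} |\tilde{M}_{\backslash j,1}| + (-1)^{j+2} \tilde{M}_{j,2} |\tilde{M}_{\backslash j,2}| + \cdots + (-1)^{j+n} \tilde{M}_{j,n} |\tilde{M}_{\backslash j,n}|. \]

But \tilde{M}_{j,k} = M_{i,k} and \tilde{M}_{\backslash j,k} = M_{\backslash j,k} by construction, so

\[ |\tilde{M}| = (-1)^{j+1} M_{i,1} |M_{\backslash j,1}| + (-1)^{j+2} M_{i,2} |M_{\backslash j,2}| + \cdots + (-1)^{j+n} M_{i,n} |M_{\backslash j,n}|. \]

On the other hand, |\tilde{M}| = 0 since \tilde{M} has two identical rows. We conclude that

\[ (-1)^{j+1} M_{i,1} |M_{\backslash j,1}| + (-1)^{j+2} M_{i,2} |M_{\backslash j,2}| + \cdots + (-1)^{j+n} M_{i,n} |M_{\backslash j,n}| \qquad (3.7.1) \]

equals 0 whenever i ≠ j. Observe that when i = j, (3.7.1) is det M expanded along row i (or row j depending on your
perspective). The proof of formula (1.6.2) then lies in noticing that for any square matrix A, the entries of the product
A · adj A all take the form (3.7.1). Accordingly A · adj A = (det A)I for any square matrix A. If A is invertible, det A ≠ 0 and we
have A · (1/det A) adj A = I, so A^{-1} = (1/det A) adj A.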
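
The identity A · adj A = (det A)I can be checked on a small example. The sketch below is illustrative only: the helper names minor, det, adjugate, and matmul and the matrix A are assumptions made for this example, and adj A is taken to be the transpose of the matrix of cofactors.

```python
def minor(A, i, j):
    """A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1          # determinant of the empty (0 x 0) matrix
    return sum((-1) ** k * A[0][k] * det(minor(A, 0, k)) for k in range(len(A)))

def adjugate(A):
    """Entry (i, j) of adj A is the (j, i) cofactor of A."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 1, 0], [1, 3, -1], [0, 5, 4]]      # illustrative matrix, det A = 30
product = matmul(A, adjugate(A))
print(det(A))      # 30
print(product)     # [[30, 0, 0], [0, 30, 0], [0, 0, 30]]
```

Each off-diagonal entry of the product is an instance of (3.7.1) with i ≠ j, which is why it vanishes; each diagonal entry is det A expanded along a row.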


Another place where row reduction could help ease an earlier burden is finding eigenvectors. Unless you happened
to work through exercise 5 of section 2.3, the last time you were asked to compute an eigenvector, you were expected
to write out a linear system of equations without using matrix notation and to solve the system using elimination
or substitution, not row operations. With the introduction of the parametric vector form for writing solution sets of
linear systems with infinitely many solutions, there is no reason not to apply matrix techniques to the task of finding
eigenvectors. Can you use row reduction to find the eigenvectors of

\[ M = \begin{bmatrix} -17 & 49 \\ -21 & 53 \end{bmatrix} \]

given that its eigenvalues are 4 and 32? Answer on page 113.


Key Concepts


determinant of an inverse if M is invertible, det M^{-1} = 1/det M.


determinant of a product if A and B are n × n matrices, then det(AB) = det A · det B.


inverses by row reduction if A is invertible, then [ A | I ] row reduces to [ I | A^{-1} ].


eigenvectors by row reduction if λ is an eigenvalue of M, then corresponding eigenvectors can be found by row
reducing M - λI.


Exercises 
1. Find the inverse by row reduction. 
4 
-3 10 
y) | -15 -10 | 
5 
7: 12 
(b) | 14 6 
6 
9 -7 
(c) a 4 | [S]-302 
q 
16 -3 -2 
(d)| -8 4 -2 | [S}-302 ‘ 
=§ 1 2 
-10 12 -50 
(ec) | 20 -18 80 
-30 18 -80 
=11 =15 4 
| 8 9 -4 
4 <3 2 9 
3 90 -308 -6 
( | 73 140 484 10 10. 
2) 6 210 -737 -16 
5. 90) 231 ad 11. 


2. If det M = 2 and det R = 4 and M and R are the same size, find [S]-303
(a) det(MR^T)
(b) det(M^{-1}R)
(c) det(MR^{-1})^2

3. Suppose L, A, M, B are square matrices such that det(LA) = 6, det(AM) = 24, and det(MB) = 48. Find
(a) det(LA^T)
(b) det(LM^{-1})
(c) det(LAMB)
(d) det(LB)

4. For a square matrix M, explain why the determinant of M^T M must be nonnegative.

5. Suppose M is invertible. Explain why PMP^{-1} is invertible for any (invertible) matrix P.

6. Support the claim that the product of invertible matrices is invertible.

7. Explain why det(PMP^{-1}) = det M for any matrices M and P, assuming both sides of the equation are defined.

8. Suppose [A]-351
(a) Is A necessarily invertible?
(b) If A is square, is A necessarily invertible?

9. If λ is an eigenvalue of M, what can you say about the pivot positions of M - λI?

10. Suppose M - cI has linearly independent columns. Can c be an eigenvalue of M? Explain. [A]-351

11. Use row reduction to find the eigenvectors corresponding to the given eigenvalue. Write your answer in parametric
vector form.


9 fd. «5 
33 17) -25 |;A=6 


149 18 45-51 -24 -60 
(ec) A=| 12 17 -18 |:a=26 [A}-352 _| 15 107 18 0 7 
12-9 8 QAt as 17 og. oo 4 = 70 
-30 -34 -16 50 
352 
(f) A . : - | Ao  .  e 
- 7 iS 26 -79 -1 = -93 
“6 12 WAS) S16: ite. <4 <ig9g- 74> 8 
%® 61 #41 75 
Answers 


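eigenvectors if λ is an eigenvalue of M, then the associated eigenvector, v, satisfies (M - λI)v = 0. For

\[ M = \begin{bmatrix} -17 & 49 \\ -21 & 53 \end{bmatrix} \]

and λ = 4, that means the unknown eigenvector satisfies

\[ \begin{bmatrix} -21 & 49 \\ -21 & 49 \end{bmatrix} v = 0. \]

This system can be solved by reducing the augmented matrix

\[ \left[\begin{array}{rr|r} -21 & 49 & 0 \\ -21 & 49 & 0 \end{array}\right]. \]

Subtracting row 1 from row 2 yields

\[ \left[\begin{array}{rr|r} -21 & 49 & 0 \\ 0 & 0 & 0 \end{array}\right]. \]

v_2 is a free variable and v_1 = (49/21)v_2 = (7/3)v_2. In parametric vector form,

\[ v = v_2 \begin{bmatrix} 7/3 \\ 1 \end{bmatrix} \]

or equivalently

\[ v = r \begin{bmatrix} 7 \\ 3 \end{bmatrix} \]

for any r. Speeding up the process for the eigenvalue λ = 32, we need to reduce the augmented matrix

\[ \left[\begin{array}{rr|r} -49 & 49 & 0 \\ -21 & 21 & 0 \end{array}\right]. \]

Again the second row disappears with one row operation, leaving

\[ \left[\begin{array}{rr|r} -49 & 49 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

from which we deduce v_1 = v_2. The solution is therefore

\[ v = r \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

for any r.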
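
The answer above can be double-checked mechanically. The following Python sketch is an illustration, not part of the original solution: the helper names rref, shifted, and is_eigenpair are hypothetical. It row reduces M - λI for each eigenvalue and confirms that the vectors found satisfy Mv = λv.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a matrix given as a list of rows."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        pivot = next((r for r in range(pivot_row, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue                      # no pivot here: a free variable
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]
        scale = A[pivot_row][col]
        A[pivot_row] = [x / scale for x in A[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

def shifted(M, lam):
    """The matrix M - lam * I."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

def is_eigenpair(M, lam, v):
    """True when M v equals lam v."""
    return all(sum(M[i][j] * v[j] for j in range(len(v))) == lam * v[i]
               for i in range(len(v)))

M = [[-17, 49], [-21, 53]]
reduced_4 = rref(shifted(M, 4))     # rows (1, -7/3) and (0, 0): v_1 = (7/3) v_2
reduced_32 = rref(shifted(M, 32))   # rows (1, -1) and (0, 0):   v_1 = v_2
print(is_eigenpair(M, 4, [7, 3]), is_eigenpair(M, 32, [1, 1]))   # True True
```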


Part II 


Matrix Abstraction 
Vector Spaces and Inner Product Spaces


Abstraction is at the heart of most of mathematics. It is an essential vehicle for the development of new ideas. Take 
natural numbers (one, two, three, and so on), for example. These numbers have a “natural” meaning—quantity. If 
you have a number of objects before you, then there are one or two or maybe twenty-two. The objects are there. They 
can be counted. But what does it mean to “count” the number of objects before you when there are none? The very 
idea is an abstraction of the notion of counting, and it leads to the number zero. To the natural numbers, we add this 
number we call zero, and say that it represents the number of objects you have when you have none. You cannot 
explain zero in the same concrete way you can explain one or two or three. It requires an intellectual leap in one’s 
understanding of counting. 

The use of variables requires another intellectual leap. It is one thing to say that 4+5 = 5+4, but it is quite another 
to say that x + y = y+ x. The unspecified quantities are abstractions of numbers. Much like zero, they represent 
something that is not readily available to see or put in other concrete terms. The very idea of using unspecified 
quantities gives rise to an entire branch of mathematics—algebra! Many branches of mathematics revolve around 
similar abstractions. A body of objects or ideas is stripped down to its essence, which then gives rise to similar but 
new objects or ideas. 

Thinking of the numbers one, two, three, and so on as counting numbers allows the abstraction of “counting” 
zero objects. Using symbols other than numbers to stand for numbers allows the abstraction of unspecified quantities 
in an expression. Abstraction provides a new perspective from which new mathematics can bloom. This idea is the 
foundation for many branches of mathematics. Abstract algebra is built upon abstraction of binary operators such as 
addition and multiplication, both of which are associative, admit an identity element and inverses, and are closed on 
certain sets of real numbers. Topology is built upon abstraction of open intervals of the real line, arbitrary unions of 
which are also open, and closed intervals of the real line, finite intersections of which are also closed. Non-Euclidean 
geometry is built upon abstraction of the idea of a line. Analysis is built upon abstraction of finite quantity and size 
to infinite quantity and infinitesimal size. At some level, each branch of mathematics is based upon abstraction. 

This part of the book proceeds in this vein. The essential ingredients of objects and ideas already explored and 
understood in a concrete way (vectors, matrix multiplication, and dot product to be specific) will be extracted from 
their concrete settings, opening doors to new mathematics. 


4.1 Vector Spaces and Span 


Back in section 1.5, the definition of linear combination contained a proviso: “let S be a set of objects on which
addition and scalar multiplication are defined.” At the time several examples of such sets were given, but not much
was made of other properties the set should have. Then in section 3.3, the definition of linear independence contained
an extended proviso: “[l]et S be a set of objects on which addition and scalar multiplication are defined and which
contains an additive identity, called 0.” Existence of an additive identity was necessary to write the equation in the
definition. Yet again, little was made of additional properties the set S should possess.

Implicit in the assumptions that addition and scalar multiplication are defined is that sums and scalar products of
objects in the set S are also in the set S. This property is called closure of the operation. The set T = {t, t^2, 1 + t^2}
is closed under neither addition nor scalar multiplication. Can you verify this? Answer on page 123. The set S =
{αt + βt^2 + γ(1 + t^2) : α, β, γ ∈ R} containing all linear combinations of elements of T (with scalars from the set of real
numbers) is closed under addition and scalar multiplication. Can you verify this? Answer on page 123. Closure is an
essential (and until now unmentioned) property of linear combinations and linear independence, and there are others.

We are so used to the basic properties of real numbers, such as associativity and commutativity, it is easy to take
the subtleties of computations for granted. In order for r(αt + βt^2 + γ(1 + t^2)) to equal (rα)t + (rβ)t^2 + (rγ)(1 + t^2), for
example, we need to distribute the r and associate scalars. Distribution alone yields r(αt) + r(βt^2) + r(γ(1 + t^2)). The
distributive and associative properties are critical to the claim that S is closed under addition and scalar multiplication.
In order for [α_1 t + β_1 t^2 + γ_1(1 + t^2)] + [α_2 t + β_2 t^2 + γ_2(1 + t^2)] to equal (α_1 + α_2)t + (β_1 + β_2)t^2 + (γ_1 + γ_2)(1 + t^2) we need
a slightly different distributive property, an associative property, and commutativity of addition! These properties are
also essential.

It is time to make explicit all the properties of matrices we have taken for granted when discussing linear combi- 
nations. This exercise will provide an abstract footing from which to explore, revealing similarities between certain 
sets of objects that could otherwise easily go unnoticed. 

A nonempty set V on which addition and scalar multiplication are defined is called a vector space if for every
u, v, w in V and all scalars s, t


1. u + v is in V (V is closed under addition)
2. u + v = v + u (addition is commutative)
3. u + (v + w) = (u + v) + w (addition is associative)
4. there is an element 0 in V such that 0 + u = u (an additive identity exists)
5. there is an element -u in V such that u + (-u) = 0. (every element has an additive inverse)
6. sv is in V (V is closed under scalar multiplication)
7. 1v = v
8. s(u + v) = su + sv (scalars distribute over elements of V)
9. (s + t)u = su + tu (elements of V distribute over scalars)
10. s(tu) = (st)u (scalar multiplication is associative)


The set of all n × 1 matrices (vectors of size n) with real entries is the model from which this list is derived. We
use the symbol R for the set of all real numbers, and use the symbol R^n for the set of all ordered lists of n real
numbers, their representation as n × 1 matrices with real entries giving meaning to the operations of addition and
scalar multiplication. For V = R^n, properties 2, 3, 4, 8, 9, and 10 are taken directly from the theorems of section 3.1;
property 5 is addressed in exercise 16 of the same section; and the other three follow from properties of real numbers.
Property 7 is clear in R^n since 1 · r = r for any real number r. Closure (properties 1 and 6) is also clear in R^n since
sums and products of real numbers are real. Hence, these ten properties describe the essential features of R^n.

The set P_n(R), the set of all polynomials with real coefficients and degree n or less, is a vector space¹. We will
verify this shortly by showing that this set satisfies the 10 properties of a vector space. Proof that other sets form
vector spaces will be requested in the exercises. Elements of any vector space V are called vectors (even if they are
matrices, sequences, or polynomials). In this abstract sense, even functions are vectors! Proof that P_n(R) is a vector
space is mostly an exercise of observing that polynomials have the right properties rather than demonstrating that
polynomials have the right properties.


1. The sum of two polynomials of degree n or less is a polynomial of degree n or less. (closure under addition)

2. For any polynomials p(x) and q(x), p(x) + q(x) = q(x) + p(x). (addition is commutative)

3. For any polynomials p(x), q(x), and r(x), p(x) + (q(x) + r(x)) = (p(x) + q(x)) + r(x). (addition is associative)

4. The polynomial z(x) = 0 is in P_n(R) and has the property that z(x) + q(x) = q(x) for any polynomial q(x). (an
additive identity exists)

5. Given any polynomial p(x), p(x) + (-1 · p(x)) = z(x) and -1 · p(x) is in P_n(R). (every element has an additive
inverse)

6. For any polynomial p(x) of degree n or less, rp(x) is a polynomial of degree n or less. (closure under scalar
multiplication)

7. 1 · p(x) = p(x) for any polynomial p(x).

8. r(p(x) + q(x)) = rp(x) + rq(x) for any real number r and polynomials p(x) and q(x). (real numbers distribute
over polynomials)

9. (r + s)p(x) = rp(x) + sp(x) for any real numbers r and s and polynomial p. (polynomials distribute over real
numbers)

10. r(sp(x)) = (rs)p(x) for any real numbers r and s and polynomial p. (scalar multiplication is associative)

¹ with the understanding that adding polynomials and multiplying polynomials by real numbers work as in high school algebra.


If the properties of polynomials were not so well known, each of these points would need to be accompanied by
reference to a theorem or axiom for support. Even with great familiarity, the same four properties that derived from
properties of real numbers (1,4,5,6) require further attention. Closure (properties 1 and 6) cannot be taken for granted.
These properties only follow because scalar multiplication and polynomial addition do not increase the degree of a
polynomial. Properties 4 and 5 assert that an additive identity is in the set and for every element of the set, its additive
inverse is also in the set. These properties are not automatic for arbitrary subsets of polynomials. It is critical to note
that P_n(R) contains them.

Coincidentally, given a subset H of any vector space V, the same four properties are the only ones that need to be
shown true (in H) to prove that H is itself a vector space. The other six properties are inherited by H through the fact
that they hold for all elements of V (including those in H). In fact property 5 can be deduced once properties 1 and 6
have been established.

Suppose H is a subset of a vector space V, properties 1 and 6 hold for H, and h is a particular but arbitrary element
of H. Because H is closed under scalar multiplication (property 6), -1 · h is in H. Because H is closed under addition
(property 1), h + (-1 · h) is in H. But h + (-1 · h) = 1 · h + (-1 · h) = (1 + (-1))h = 0h by properties 7 and 9. If it were
true that 0h = 0 (like it is in R^n), we would be done as -1 · h would then have been shown to be an additive inverse of h
in H. Can you prove that 0h = 0 using only the ten properties of a vector space? Answer on page 123. Hence a subset of a
vector space is a vector space itself if it satisfies properties 1, 4, and 6 (closure and containment of the zero vector).

Any subset of a vector space that is itself a vector space is called a subspace. One common way to define a
subspace is through the collection of all linear combinations of a set of vectors (reminiscent of how T = {t, t^2, 1 + t^2}
is not closed under addition or scalar multiplication but the collection of all linear combinations of elements of T is). Such
a set is called the span of those vectors. To be precise, if v_1, v_2, ..., v_k are elements of a vector space (vectors), the
span of v_1, v_2, ..., v_k, denoted span{v_1, v_2, ..., v_k}, is given by


span{v_1, v_2, ..., v_k} = {c_1 v_1 + c_2 v_2 + ··· + c_k v_k : c_1, c_2, ..., c_k are scalars}.


Can you show that given any vectors v_1, v_2, ..., v_k of a vector space, span{v_1, v_2, ..., v_k} is a subspace? Answer on
page 123. For consistency with the notion of exercise 26 of this section and the nonemptiness of vector spaces (and
the concept of a basis, which will be studied later), the span of the empty set is {0}. That is, span{} = {0}.
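
In R^n, membership in a span can be tested with the row reduction tools of earlier chapters: v is in span{v_1, ..., v_k} exactly when appending v as an extra column to the matrix with columns v_1, ..., v_k does not raise the rank (that is, the augmented system is consistent). The Python sketch below is one way to phrase that test; the function names rref, rank, and in_span and the sample vectors are assumptions made for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a matrix given as a list of rows."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        pivot = next((r for r in range(pivot_row, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]
        scale = A[pivot_row][col]
        A[pivot_row] = [x / scale for x in A[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

def rank(rows):
    """Number of nonzero rows in the reduced row echelon form."""
    return sum(1 for row in rref(rows) if any(row))

def in_span(v, vectors):
    """True when v is a linear combination of the given vectors of R^n."""
    n = len(v)
    # spanning vectors become the columns of the coefficient matrix
    coefficient = [[vec[i] for vec in vectors] for i in range(n)]
    augmented = [row + [v[i]] for i, row in enumerate(coefficient)]
    return rank(coefficient) == rank(augmented)

S = [(1, 0, -1), (0, 1, 2)]
print(in_span((1, 2, 3), S))    # True:  (1, 2, 3) = 1(1, 0, -1) + 2(0, 1, 2)
print(in_span((1, 2, 4), S))    # False
print(in_span((0, 0, 0), []))   # True, matching the convention span{} = {0}
```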


Key Concepts 


vector space A set V on which addition and scalar multiplication are defined is called a vector space if for every
u, v, w in V and all scalars (real numbers or complex numbers) s, t
1. u + v is in V (V is closed under addition)
2. u + v = v + u (addition is commutative)
3. u + (v + w) = (u + v) + w (addition is associative)
4. there is an element 0 in V such that 0 + u = u (an additive identity exists)
5. there is an element -u in V such that u + (-u) = 0. (every element has an additive inverse)
6. sv is in V (V is closed under scalar multiplication)
7. 1v = v
8. s(u + v) = su + sv (scalars distribute over elements of V)
9. (s + t)u = su + tu (elements of V distribute over scalars)
10. s(tu) = (st)u (scalar multiplication is associative)


vector an element of a vector space. 


span for vectors v_1, v_2, ..., v_k of a vector space, the span of v_1, v_2, ..., v_k, denoted span{v_1, v_2, ..., v_k}, is the
collection of all linear combinations of v_1, v_2, ..., v_k. That is,
span{v_1, v_2, ..., v_k} = {c_1 v_1 + c_2 v_2 + ··· + c_k v_k : c_1, c_2, ..., c_k are scalars}.


Additionally, span{} = {0}.
subspace a subset of a vector space that is itself a vector space. 


subspace conditions a subset of a vector space is a subspace if it contains 0 and is closed under addition and scalar 
multiplication. 


span as subspace given any subset H of a vector space V, span H is a subspace of V.


Notation 


Some subsets of the complex numbers are denoted as follows.
Z = {z : z is an integer}
Z^+ = {z : z is a positive integer}
Q = {q : q is a rational number}
R = {r : r is a real number}
C = {c : c is a complex number}
Some sets, each of which has a natural definition as a vector space, are denoted as follows. F is taken to be one of²
Q, R, or C, and D is taken to be a subset of R.
R^n = {v : v is an ordered list of n real numbers}
R^N = {s : s is a sequence of real numbers}
M_{m×n}(F) = {M : M is an m × n matrix with entries in the set F}
P_n(F) = {p : p is a polynomial of degree n or less with coefficients in F}
F(D) = {f : f is a function from D to R}
C(D) = {f : f is a continuous function from D to R}


² F could actually be any field.


Exercises

1. Verify that the set, with its well known operations of addition and scalar multiplication, forms a vector space.
(a) M_{m×n}(R)
(b) F([0, 1]) [S]-303
(c) C(R)
(d) R^N [The “well known” operations are done component-wise.]

2. Verify that S is a subspace of the vector space V. You do not have to verify that V is a vector space.
(a) S = the x-axis = {(x, y) : y = 0}; V = R^2 [S]-303
(b) S = the line with slope m passing through the origin = {(x, y) : y = mx}; V = R^2
(c) S = the set of polynomials of degree two or less having 12 as a root = {p(x) = ax^2 + bx + c : p(12) = 0}; V = P_2(R)
(d) S = the set of cubic polynomials with degree at most three and 3 and 18 as roots = {q(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 :
q(3) = q(18) = 0}; V = P_3(R) [S]-304
(e) S = the set of functions on [0, 1] whose graphs pass through (0, 0) and (1, 0); V = F([0, 1])
(f) S = the set of real number sequences that converge to zero = {r_1, r_2, ... : lim_{n→∞} r_n = 0}; V = R^N
(g) S = the set of real 3 × 3 matrices with zero determinant; M_{3×3}(R)

3. Show that S is not a subspace of the vector space V. You do not have to verify that V is a vector space.
(a) S = the first quadrant of the plane, including the axes = {(x, y) : x ≥ 0 and y ≥ 0}; V = R^2 [S]-304
(b) S = the first and third quadrants of the plane, including the axes = {(x, y) : x ≥ 0 and y ≥ 0} ∪ {(x, y) : x ≤ 0 and
y ≤ 0}; V = R^2
(c) S = the closed disk of radius 5 with center at the origin = {(x, y) : x^2 + y^2 ≤ 25}; V = R^2
(d) S = the set of polynomials of degree two or less with y-intercept (0, -3) = {p(x) = ax^2 + bx + c : p(0) = -3};
V = P_2(R) [S]-304
(e) S = the set of polynomials of degree three or less such that 7 is not a root = {q(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 :
q(7) ≠ 0}; V = P_3(R)
(f) S = the set of polynomials of degree four or less with its real roots between 0 and 10 union the set containing
the zero polynomial = {q(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 : q(x_0) = 0 ⇒ 0 < x_0 < 10 or q(x) = 0 for all x};
V = P_4(R)
(g) S = the set of all real number sequences that converge to three = {r_1, r_2, ... : lim_{n→∞} r_n = 3}; V = R^N
(h) S = the set of all real 3 × 3 matrices with determinant one; M_{3×3}(R)


4. Describe spanS geometrically. 


(a) S -{| ; } [S]-304 
> |} [S]-304 


+ | 3 ss 


| 
|} 


[A]-352 


OonNnwy coccooco coco 


5. 


10. 


12. 


0 
. Let S be the set of all vectors of the form | -3t Find 


. Let S be the set of all vectors of the form 


He 


Describe spanS in words. 


(a) S = {(1,1,2,1,1,1,1,1,1,1,...)} [A}-352 
(b) S = {(1,2,3,4, 5,6, 7, 8,9, 10,...)} 

(c) S = {p(x) = x1 — x)} 

(d) S = {p(x) = 3} [A]-352 


safle oH 3 


1 

1 0]}f 0 

os = {fo oblo 
305 


0 0 
1 ]*| 0 
Let S be the set of all vectors of the form 


set of vectors in R? whose span is S. 


a set of vectors in R? whose span is S. 
Let S be the set of all vectors of the form 
St 


set of vectors in R* whose span is S. [S]-305 


Let S be the set of all vectors of the form 


3t-2s 
St+s 
Find a set of vectors in R* whose span is S'. [S]-305 
3t-4r 
—t+4r | 
St 


Let S be the set of all vectors of the form 


Find a set of vectors in R? whose span is S.. 

Stt 
2s—t 
Find a set of two vectors in R? whose span is S. Is 


oll HU =" 


1 
| 2 |» spans ? 
3 


3 
(a) S=3] 2 | 
1 
a 10 
(b) S=4] 8 fF 2 | [S]-305 
9 ~6 
2 -10 
(c) S=2| -3 |,} -24 |$ [A}-352 
5 41 
2 -10 8 
(d) S =4| -3 |,] -24 |,] 1 
5 4l =2 
1 1 1 
- : : | | | 
0 0 1 
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-3 ait ee -12 —560 
() S=4} O |,{ -3 |,| -32 |} [A}352 u, =| -20 |;u.=| 255 |; 
0 0 : 20 -100 
, : -8 
2 2 glo 
13. Is vin span {sin 8, cos oh is | 25 |: [A}-352 
(a) v= (f(@) = 3) -20 
(b) v = (f(@) = sin(26@)) ) Sagetiath ell 411 
SageMath Cell a : 
(c) v= (f(@) = cos(26)) [S]-305 ”) @ = oS 
_ Le ees) 
(a) v=(f@ = sin"(@/2)) a0 eS ais 
14. Is vin span {(1, 2,3,4,5,...), u,=] 11 |;uw=] -6 |;u,;=] 100 |; 
(0, 4,0,0,0,...),¢1, 1, 1,1, 1,...)}? —5 17 397 
(a) v = (5,10, 15,20, 25,...) [A]-352 Z 
(b) v = (3,87, 9, 12, 15,...) () QRS 50. - isl 
(c) v = (78, 81, 84, 87, 90,...) [S]-305 44 
(d) v= (2,4, 8, 16, 32,...) 10 -4 8 
11 4 -12 
(e) v = (-3,-1,-3,-2,-3,...) m=} 7 pwe=]_59 Pw=Al _5 f [A]- 
(f) v = (0, 32, 4,6, 8,...) [A]-352 -7 -7 -6 
352 
15. Is b in the span of the columns of M? 79 
—240 SageMath Cell _| 33) I. 
(a) ee 145 =| —406 |- (a) AS) Saver pe =| 3105.) 
-416 366 
-499 -288 232 4 -10 -3 
M=| -425 -161 125 | [S]-306 i ee a Se eet Ep 
306-348 141 oe ee eee ee es Os 
11 -6 -l11 
—284 
(b) £9) Sagettath Cell 45b=| 146 |; 17. For each pair v and S, v is in spanS. Show that S U {v} 
—187 is linearly dependent. 
237-228 -234 _4 2 -1 
M=| -440 -772 36 (a v=] 4 PS = 4) 5 |] 3 [p (Ab352 
243, -612 -366 18 i 6 g 
186 mrla bs} pA} 
—352 
© OESD 0, -| 3 | eee eet oa) 
273 


(d) v=5P -9t4+5;S = {12,0} [A]-352 
-118 35 -94 -170 


(e) v = (0,0, 0,0, 0,...); 
277-101 496 -336 7 
M=\ oy. os a0, ae. ee FS(2,-3,4 51.00) 
182-233 326 -252 (f) v = (2,3, 4,6, 6,9, 8, 12,...); 
S = {(,0, 2,0, 3,0,4,0,...), 


2 (0, 1,0,2,0,3,0,4,...)} 
_| 30 | 
(d) © a b= -120 1 18. Argue that (in general) if v is in spanS, then S U {v} is 
-437 linearly dependent. 
-133 -77 -153 209 19. Find k so that kf? + 9t — 8 is in 
uu| 69 323 490-98 span {37° - 4, 47° — 37}. 
“| -419 15 305 129 2 3.5 
-157 -10  -230 377 20. Let M = 1 82 } Is the set of all v such that 


16. Is v in span {u;,, Uo, U3}? 0 
Dan AW Mae Usl uy =| 0 | ssubspace of 2 of R3? 


21. Suppose u_1 and u_2 both have the property of being a zero


-—7424 
vector. That is, for any vector v, u_1 + v = v and u_2 + v = v.


OED... Bes 


—1060 
Supply a reason for each line of the string of equalities
proving that u_1 = u_2, thus proving that the zero vector is
unique [and therefore 0 is the additive identity]. [A]-352
u_1 = u_2 + u_1
= u_1 + u_2
= u_2

22. Prove that for any vector v in a vector space, -1 · v is an
additive inverse of v. Use the fact that 0v = 0.

23. Suppose u_1 and u_2 both have the property of being an
additive inverse of v. That is, for this arbitrary but particular
vector v, u_1 + v = 0 and u_2 + v = 0. Supply a reason
for each line of the string of equalities proving that
u_1 = u_2, thus proving that additive inverses are unique
[and therefore -1 · v is the additive inverse of v].
u_1 = u_1 + 0
= u_1 + (u_2 + v)
= u_1 + (v + u_2)
= (u_1 + v) + u_2
= 0 + u_2
= u_2


24. Let c be any scalar. Supply a reason for each line of the
string of equalities proving that c0 = 0.
c0 = c0 + 0
= c0 + (c0 + (-(c0)))
= (c0 + c0) + (-(c0))
= c(0 + 0) + (-(c0))
= c0 + (-(c0))
= 0

25. Suppose cu = 0 for some nonzero scalar c. Create a
string of equalities showing that u = 0 and justify each
equality in the string.

26. Suppose V is a vector space, H is a subset of V, and S
is any subspace of V containing H. Explain why span H
must be a subset of S. This shows that span H is the
smallest subspace of V containing H.

27. Let S and T be subspaces of a vector space V. Show that
S ∩ T is a subspace of V.

28. Find subspaces S and T of a vector space V such that
S ∪ T is not a subspace of V.


Answers


not closed With T = {t, t^2, 1 + t^2}, t ∈ T and t^2 ∈ T, but t + t^2, the sum of two elements in T, is not itself in T. 17t, a
scalar multiple of an element of T, is not itself in T.


closure of linear combinations With S = {αt + βt^2 + γ(1 + t^2) : α, β, γ ∈ R}, an arbitrary element of S has the form
αt + βt^2 + γ(1 + t^2) for some scalars α, β, γ. The scalar multiple of an arbitrary element of S, r(αt + βt^2 + γ(1 + t^2)) =
(rα)t + (rβ)t^2 + (rγ)(1 + t^2) has the form of an element of S and is therefore in S. The sum of two elements of
S, [α_1 t + β_1 t^2 + γ_1(1 + t^2)] + [α_2 t + β_2 t^2 + γ_2(1 + t^2)] = (α_1 + α_2)t + (β_1 + β_2)t^2 + (γ_1 + γ_2)(1 + t^2) has the form
of an element of S and is therefore in S too.


the zero vector Proving that 0v = 0 for any vector v of any vector space V is a very abstract and subtle chore. Where
to start is a particularly befuddling question. Proofs of this nature are very difficult when seen for the first time.
You are in good company if you did not come up with a proof of your own. One way to proceed is to start with
one side and produce a list of equalities that flow logically from the properties and lead to the other side. Let v
be a particular but arbitrary element of a vector space V.

0v = 0v + 0 [0 is the additive identity]
= 0v + (0v + (-0v)) [property of additive inverses]
= (0v + 0v) + (-0v) [addition is associative]
= (0 + 0)v + (-0v) [scalars distribute over vectors]
= 0v + (-0v) [substitution of 0 for 0 + 0]
= 0 [property of additive inverses]


span is a subspace Let v_1, v_2, ..., v_k be elements of a vector space and set H = span{v_1, v_2, ..., v_k}. Then

1. [property 4] 0v_1 + 0v_2 + ··· + 0v_k = 0 + 0 + ··· + 0 = 0 is in H.

2. [property 1] for any elements u and v of H, there are scalars b_1, b_2, ..., b_k and c_1, c_2, ..., c_k such that

u = b_1 v_1 + b_2 v_2 + ··· + b_k v_k and v = c_1 v_1 + c_2 v_2 + ··· + c_k v_k

so
u + v = (b_1 v_1 + b_2 v_2 + ··· + b_k v_k) + (c_1 v_1 + c_2 v_2 + ··· + c_k v_k)
= b_1 v_1 + c_1 v_1 + b_2 v_2 + c_2 v_2 + ··· + b_k v_k + c_k v_k
= (b_1 + c_1)v_1 + (b_2 + c_2)v_2 + ··· + (b_k + c_k)v_k
is in H.

3. [property 6] for any element u of H, there are scalars b_1, b_2, ..., b_k such that
u = b_1 v_1 + b_2 v_2 + ··· + b_k v_k
so given any scalar s,

su = s(b_1 v_1 + b_2 v_2 + ··· + b_k v_k)
= s(b_1 v_1) + s(b_2 v_2) + ··· + s(b_k v_k)
= (sb_1)v_1 + (sb_2)v_2 + ··· + (sb_k)v_k

is in H.




4.2 Basis and Dimension 


Every vector in R^n can be written as a unique linear combination of the columns of the n × n identity matrix. Think
about it for a moment. Can you justify this claim? Answer on page 129.

Taking the perspective that the matrix-column product Iv is a linear combination of the columns of I with coefficients
from v, what we are claiming is that for any vector b, the equation


Iv = b (4.2.1)


has exactly one solution v. By the invertible matrix theorem, this is equivalent to claiming that I is invertible, which
of course is true!

Noting that the solution of (4.2.1) is v = b we see the vector b itself holds the coefficients of the proclaimed
linear combination. There is nothing ground-breaking about the calculation itself. However, it strikes at the essence
of elements of R^n, paving the way for abstraction to arbitrary vector spaces.

Given any subset of a vector space, we can represent linear combinations of these vectors by the element of R^n
holding the coefficients of the linear combination. If there were special subsets where each element of the vector
space could be represented by a unique linear combination (as in the example of the columns of I), this perspective
would hold promise. As it turns out, there are many such sets. For example, T = {t, t^2, 1 + t^2} forms such a set in
P_2(R). Can you verify this? Answer on page 129.

In $\mathbb{R}^n$ the columns of any invertible matrix $M$ form such a set. If $M$ is invertible, the invertible matrix theorem
tells us that for any vector $\mathbf{b}$ there is exactly one solution $\mathbf{v}$ of the equation


\[ M\mathbf{v} = \mathbf{b}. \]


In terms of linear combinations, every vector $\mathbf{b}$ can be written as a unique linear combination of the columns of $M$.
In the terminology of the previous section, the fact that $M\mathbf{v} = \mathbf{b}$ always has a solution is to say that the columns of $M$
span $\mathbb{R}^n$. The linear independence of the columns of $M$ ensures uniqueness of solution (by theorem 5). These are the
characteristics of the columns of an invertible matrix that make them a special set: linear independence and span.
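
As a small illustration (not part of the original text), the following Python sketch finds the unique coefficients for one example. The matrix `M`, the vector `b`, and the helper `solve_2x2` are assumptions made for this sketch; the helper uses Cramer's rule, which only covers the 2 x 2 case.

```python
from fractions import Fraction

def solve_2x2(M, b):
    """Solve M v = b for a 2 x 2 matrix M using Cramer's rule.

    Raises ValueError when det(M) = 0, in which case the columns of M
    are not a basis of R^2 and no unique solution is guaranteed.
    """
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("M is not invertible")
    v1 = (b[0] * s - q * b[1]) / det
    v2 = (p * b[1] - b[0] * r) / det
    return [v1, v2]

M = [[2, 1],
     [1, 3]]          # columns (2, 1) and (1, 3): an illustrative choice
b = [5, 10]
v = solve_2x2(M, b)   # coefficients of b relative to the columns of M

# Rebuild b from the columns of M to confirm the linear combination.
rebuilt = [v[0] * M[0][0] + v[1] * M[0][1],
           v[0] * M[1][0] + v[1] * M[1][1]]
```

Here `v` holds the only coefficients that work: changing either entry changes `rebuilt`.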

The terms linear independence and span have meaning in any vector space, not just $\mathbb{R}^n$, motivating the following
definitions. A subset $S$ of a vector space $V$ is called a spanning set if $\operatorname{span} S = V$. A subset of a vector space $V$ is
called a basis (of $V$) if it is a linearly independent spanning set.

From the discussion that led to the definition of a basis, we can be sure that a basis of $\mathbb{R}^n$ has the special property
that every vector in $\mathbb{R}^n$ can be written as a unique linear combination of the vectors in the basis. However, theorem 5,
which provided uniqueness, does not apply to arbitrary vector spaces. We will need to argue that if $\mathcal{B}$ is a basis of an
arbitrary vector space $V$ and $\mathbf{v}$ is in $V$, then $\mathbf{v}$ is expressible as a unique linear combination of the vectors in $\mathcal{B}$.

Let $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis (linearly independent spanning set) for a vector space $V$ and let $\mathbf{v}$ be a vector
in $V$. Because $\mathcal{B}$ is a spanning set, every vector in $V$ can be written as a linear combination of the elements of $\mathcal{B}$,
including $\mathbf{v}$. Thus $\mathbf{v}$ is expressible as at least one linear combination of the vectors in $\mathcal{B}$. It remains to show $\mathbf{v}$ is
expressible as at most one linear combination of the vectors in $\mathcal{B}$. To that end, suppose $\mathbf{v}$ has two representations as
linear combinations of the elements of $\mathcal{B}$. That is,


\[ \mathbf{v} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_n\mathbf{v}_n \quad\text{and}\quad \mathbf{v} = b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \cdots + b_n\mathbf{v}_n. \]


Then

\[
\begin{aligned}
\mathbf{0} = \mathbf{v} - \mathbf{v} &= (a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_n\mathbf{v}_n) - (b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \cdots + b_n\mathbf{v}_n) \\
&= (a_1 - b_1)\mathbf{v}_1 + (a_2 - b_2)\mathbf{v}_2 + \cdots + (a_n - b_n)\mathbf{v}_n.
\end{aligned}
\]

Since $\mathcal{B}$ is a linearly independent set, the only solution of this equation is


\[ a_1 - b_1 = a_2 - b_2 = \cdots = a_n - b_n = 0 \]


and therefore the two linear combinations are equal. So, given a basis of a vector space, every element of the vector
space can be written as a unique linear combination of the elements of the basis.

Bases (the plural of basis) have another important property. If there is a basis of a vector space with n elements, 
then any subset of the vector space with more than n elements is linearly dependent. 

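Suppose $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis of a vector space $V$ and $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_p\}$ where $p > n$. Since $\mathcal{B}$ is a
spanning set, each element of $S$ can be written as a linear combination of the elements of $\mathcal{B}$. Let


\[
\begin{aligned}
\mathbf{u}_1 &= M_{1,1}\mathbf{v}_1 + M_{2,1}\mathbf{v}_2 + \cdots + M_{n,1}\mathbf{v}_n \\
\mathbf{u}_2 &= M_{1,2}\mathbf{v}_1 + M_{2,2}\mathbf{v}_2 + \cdots + M_{n,2}\mathbf{v}_n \\
&\;\;\vdots \\
\mathbf{u}_p &= M_{1,p}\mathbf{v}_1 + M_{2,p}\mathbf{v}_2 + \cdots + M_{n,p}\mathbf{v}_n
\end{aligned}
\]


Then

\[
M = \begin{bmatrix}
M_{1,1} & M_{1,2} & \cdots & M_{1,p} \\
M_{2,1} & M_{2,2} & \cdots & M_{2,p} \\
\vdots & \vdots & & \vdots \\
M_{n,1} & M_{n,2} & \cdots & M_{n,p}
\end{bmatrix}
\]


has more columns than rows, so $M$ cannot have a pivot in each column. By theorem 5, there is a nonzero vector $\mathbf{w}$
such that $M\mathbf{w} = \mathbf{0}$. Let $\mathbf{w} = [\, w_1 \;\; w_2 \;\; \cdots \;\; w_p \,]^T$ be such a vector. Then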


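\[
\begin{aligned}
w_1\mathbf{u}_1 + w_2\mathbf{u}_2 + \cdots + w_p\mathbf{u}_p
&= w_1M_{1,1}\mathbf{v}_1 + w_1M_{2,1}\mathbf{v}_2 + \cdots + w_1M_{n,1}\mathbf{v}_n \\
&\quad + w_2M_{1,2}\mathbf{v}_1 + w_2M_{2,2}\mathbf{v}_2 + \cdots + w_2M_{n,2}\mathbf{v}_n \\
&\quad\;\;\vdots \\
&\quad + w_pM_{1,p}\mathbf{v}_1 + w_pM_{2,p}\mathbf{v}_2 + \cdots + w_pM_{n,p}\mathbf{v}_n \\
&= (M_{1,1}w_1 + M_{1,2}w_2 + \cdots + M_{1,p}w_p)\mathbf{v}_1 \\
&\quad + (M_{2,1}w_1 + M_{2,2}w_2 + \cdots + M_{2,p}w_p)\mathbf{v}_2 \\
&\quad\;\;\vdots \\
&\quad + (M_{n,1}w_1 + M_{n,2}w_2 + \cdots + M_{n,p}w_p)\mathbf{v}_n \\
&= 0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 0\mathbf{v}_n \\
&= \mathbf{0}.
\end{aligned}
\]


Since $\mathbf{w} \neq \mathbf{0}$, we have demonstrated a nontrivial linear combination of the vectors in $S$ that sums to $\mathbf{0}$, showing that $S$
is linearly dependent. This fact is important enough to repeat as a theorem.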


Theorem 9. If a vector space $V$ has a basis with $n$ elements, then any subset of $V$ containing more than $n$ elements is
linearly dependent.
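
The argument behind theorem 9 can be traced on a small example. The sketch below is an illustration, not part of the original text: it takes three vectors in $\mathbb{R}^2$ (more vectors than the dimension), places their coordinates in the columns of a matrix, and row reduces to find a nonzero $\mathbf{w}$ with $M\mathbf{w} = \mathbf{0}$. The function name `nonzero_null_vector` and the sample vectors are assumptions chosen for the sketch.

```python
from fractions import Fraction

def nonzero_null_vector(M):
    """Return a nonzero w with M w = 0, or None if every column has a pivot."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]          # scale pivot to 1
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == n_rows:
            break
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if not free_cols:
        return None
    # Set one free variable to 1, the other free variables to 0,
    # and read the pivot variables off the reduced rows.
    free = free_cols[0]
    w = [Fraction(0)] * n_cols
    w[free] = Fraction(1)
    for row_index, c in enumerate(pivot_cols):
        w[c] = -rows[row_index][free]
    return w

# Columns are u1 = (1, 2), u2 = (3, 1), u3 = (5, 5), written in the standard basis.
M = [[1, 3, 5],
     [2, 1, 5]]
w = nonzero_null_vector(M)   # weights of a nontrivial combination equal to 0
```

With these vectors the weights come out so that $w_1\mathbf{u}_1 + w_2\mathbf{u}_2 + w_3\mathbf{u}_3 = \mathbf{0}$ with not all weights zero, which is exactly the dependence the theorem promises.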


To understand the implications of this theorem, let $V$ be a vector space with basis $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ and consider
two subsets of $V$: $F$ containing fewer than $n$ vectors and $G$ containing more than $n$ vectors. $G$ cannot be a basis
because it has more than $n$ elements (which makes it a linearly dependent set). $F$ cannot be a basis for $V$ because if
it were, theorem 9 would make $\mathcal{B}$ linearly dependent (since $\mathcal{B}$ has more elements than $F$), thereby contradicting the fact
that $\mathcal{B}$ is a basis. In other words, if a vector space $V$ admits a basis with $n$ elements, all bases of $V$ have $n$ elements.
This number $n$ is thus a characteristic of $V$ and deserves a name. We call the number of elements in a basis the dimension
of a vector space. The trivial vector space, $\{\mathbf{0}\}$, contains only the one vector $\mathbf{0}$ and, by definition, has dimension 0.
Observe that


1. the dimension of $\mathbb{R}^n$ is $n$, and
2. the dimension of $P_2(\mathbb{R})$ is 3

because

1. the columns of $I_{n \times n}$ form a basis of $\mathbb{R}^n$, and
2. $T = \{t, t^2, 1 + t^2\}$ forms a basis of $P_2(\mathbb{R})$.

$\{I_{:,1}, I_{:,2}, \ldots, I_{:,n}\}$ is called the standard basis of $\mathbb{R}^n$ and $\{1, t, \ldots, t^n\}$ is called the standard basis of $P_n(\mathbb{R})$.

We already know that if a vector space $V$ has dimension $n$, then any subset of $V$ containing fewer than $n$ elements
is not a basis. It therefore must either be linearly dependent or not span $V$. We will argue that such a set does not
span $V$. The statement is clearly true when the dimension of $V$ is 0 (there are no sets with fewer than 0 elements) and
when $n = 1$ (only the empty set has fewer than one element, but the empty set does not span a vector space of dimension 1). For
$n \geq 2$, let $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_p\}$ be a subset of $V$ with $p < n$ and suppose $S$ is a spanning set. Because $S$ spans $V$ but is
not a basis, $S$ must be linearly dependent. This is a contradiction in the case $n = 2$ since a set consisting of one nonzero
vector is linearly independent (and the set $\{\mathbf{0}\}$ spans only the trivial vector space). In the case $n > 2$, $S$ contains some
element that can be written as a linear combination of the others. By rearranging the order of the elements in $S$ if
necessary, let $\mathbf{u}_1$ be the element that can be written as a linear combination of the others and write


\[ \mathbf{u}_1 = c_2\mathbf{u}_2 + c_3\mathbf{u}_3 + \cdots + c_p\mathbf{u}_p. \tag{4.2.2} \]

Note that $\hat{S} = S \setminus \{\mathbf{u}_1\} = \{\mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_p\}$ still spans $V$: for arbitrary $\mathbf{v}$ in $V$, write

\[ \mathbf{v} = d_1\mathbf{u}_1 + d_2\mathbf{u}_2 + \cdots + d_p\mathbf{u}_p \tag{4.2.3} \]


and substitute (4.2.2) into (4.2.3), thereby writing $\mathbf{v}$ as a linear combination of the elements of $\hat{S}$. If $\hat{S}$ is linearly
dependent, repeat the process of removing an element that can be written as a linear combination of the others, and
continue until the remaining set is linearly independent (and still spanning). This will happen when there is one
element left, if not sooner, so the process is guaranteed to end. At this point what is left is a basis with fewer than $n$
elements, an impossibility. It must be that $S$ is not spanning.
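
The removal process used in this argument can be sketched in code. The following Python is an illustration under stated assumptions, not the book's method: vectors are tuples of rational numbers in $\mathbb{R}^m$, and a vector is judged to be "a linear combination of the others" by checking that deleting it leaves the rank (the dimension of the span) unchanged. The names `rank` and `prune_to_basis` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the span of the given vectors, found by elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def prune_to_basis(vectors):
    """Discard vectors one at a time as long as the span does not shrink."""
    current = list(vectors)
    target = rank(current)
    i = 0
    while i < len(current):
        trial = current[:i] + current[i + 1:]
        if trial and rank(trial) == target:
            current = trial   # current[i] was a combination of the others
        else:
            i += 1            # current[i] is needed; keep it and move on
    return current

S = [(1, 0), (0, 1), (1, 1)]      # spans R^2 but is linearly dependent
basis = prune_to_basis(S)          # a linearly independent subset with the same span
```

Which vector gets discarded depends on the order of the list, so the resulting basis is one of several possible answers; its size, however, is always the same.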

Key Concepts 

spanning set $S$ is a spanning set of a vector space $V$ if $\operatorname{span} S = V$.

basis a linearly independent spanning set of a vector space.

dimension the number of vectors in a basis of a vector space.

standard basis of $\mathbb{R}^n$ the columns of $I_{n \times n}$.

standard basis of $P_n(\mathbb{R})$ $\{1, t, \ldots, t^n\}$.

linear dependence if a vector space $V$ has dimension $n$, any subset of $V$ with more than $n$ elements is linearly
dependent.

spanning if a vector space $V$ has dimension $n$, any subset of $V$ with fewer than $n$ elements does not span $V$.

bases all bases of a given vector space have the same number of elements.

trivial vector space $\{\mathbf{0}\}$ contains only the one vector $\mathbf{0}$ and, by definition, has dimension 0.


Exercises =e) “15 
(a) 238 |, ae 
1. State the dimension of the vector space. ~250 
(b) R® [A]-352 (b) 


384 -78 
—391 |,| -262 |°|| 
(c) R? —339 349 


| | 
coo off |] | 
| 


(a) R° 


423 
—218 


[A]-352 
(f) Ps(R) [A]-352 : 

(g) PR) 

(h) P,(R) (d) 


2. Explain why the set is not a basis of R>. 


165 a -37 
-136 |’| 308 198 
-34 ~264 


51 ; 491 
IE 
115 46 -101 —94
(e) 136 |,] 90 |,} -122 },] —-91 
111 -131 105 —148 


3. Explain why the set is not a basis of P3(R). 


(a) {168+217t-115 +3838, 604+3491+498P +24377} 


(b) {-407+342r-94/? 3543, 188-272-1843, 117+ 
114t — 4377 + 49315} 


(c) { 

(d) {0,t,2?,f} 

(e) { 

(f) (27 —70t, -8972 + 
1001} 


—9+1,-44 9t- 


’,7—-2,-2 + 7t- 6f7} 


t,,0,t*} [A]-352 


52,1 + 58t, -277 — 742°, 28 — 


4. Each set is a subset of a vector space. In each set, one
vector is the sum of the other two. (i) Name a vector
space containing the set, and (ii) find a proper subset with
the same span.


9 18

12 8
Gl a lit |

18 7


9 

-4 
7 

=n 


(b) {9—4t+777, -20+81+179, -114+4t+241°} [A]-352 
(c) {(-4,-7,-10, -13,...),(1,2,3,4,...), 


(=93;—9, 1351/50 


-19 


offs o || 
Een 


)} 
<9 <9 
14 et ||? 


[S]-306 


(e) {3 sind — 2cos6, 8 sin6 + 2 cos 6, 


5 sin@ + 4cos 6} 


5. Redo part (ii) of exercise 4, this time stating a different 
subset with the same span. Is this subset a basis for the 
span of the set? [S]-306 [A]-352 


6. Find the dimension of the span of the set. 


2 
2 


r 


a 


1 
@) {15 |. 
2 


a 


(b) {3 +2,1,1 -?} 


(c) {1+ 2r,3+1¢- 2° 


[S]-306 
{fo oflo 


5 
P| 
10 
10 


,-5 +47, -5¢- 2r| 


fli oflo rh 


(e) {cos? 0, sin 6, 1} [A]-352 


oH 


=) 4
> Lp aa
= 2


| [A]-352


7. Justify the claim that $\{1, t, t^2, \ldots, t^n\}$ is a basis of $P_n(\mathbb{R})$.


8. Suppose $\operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\} = P_3(\mathbb{R})$. Explain why
$\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\}$ is a basis of $P_3(\mathbb{R})$. [A]-352


9. Suppose $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\}$ is a linearly independent set in
$\mathbb{R}^4$. Explain why $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\}$ is a basis of $\mathbb{R}^4$. [A]-353


10. Let $V$ be a vector space with dimension $n$. Explain why
a spanning set with $n$ elements must be a basis.


11. Let $V$ be a vector space with dimension $n$. Explain why a
linearly independent set with $n$ elements must be a basis.


12. Do the columns of the matrix (considered as a set of vectors)
form a basis for $\mathbb{R}^4$?


-11 8 18 15 
(ay| 19 2 5 -3 
17 11 20 -2 
-13 19 19 16 
-17 -16 -16 14 
wb) -19 5 5 -10 vanes 
-14 15 15 4 
-14 14. 8 
—2 18 -17 
ee a a 
12 -18 -4 
-17 18 -12 9 
0 -8 4 17 
Ol oO o 12 ~o.| Ae 
0 oO oO. -10 
-7 18 19 -12 
0 0 8 2 
Ol ogo 90 o -1 
0 0 0 0 
9 3 13 Il -5 -6 
-17 12 -14 10 -2 -7 
() =13. 8. 6 te -=15 1¢ |e 


-8 19 -12 15 2 1 


13. Do the columns of the matrix (considered as a set of vectors)
form a basis for $\mathbb{R}^6$?


(a) .)) Sagelath Cell 52 


—89 86 —47 69 -88& 
—27 27 95 -16 93 
—50 150 -73 52 -17 
-6 78 60 84 41 
132 -110 -126 79 —-90 
96 -124 —-41 147 75 
(b) . 3) Sagemath Cell Jay 
7 —4 59 -15 49 
21 -17 232 = =©-55 197 
—21 9 -145 38 -119 
—21 1 -63 2 -45 
-7 -l1 -9 10 -12 
14 -13 173-42 148 
[S]-307 
©) .3)) Sagetath Cell Ja 
-413 -649 26 —-7 -73 
42 66 0 -6 8 
-77 =-121 4 1 -14 
63 99 -6 6 11 
-147 -231 6 6 -27 
308 484 -18 0 57 


-61 
133 
24 
91 
—137 
-117 


56 
242 
-129 
-7 
—33 
190 


—513 
48 
—93 
87 
—178 
356 
a) QEESETSD 5; 126 -64 -94 98 -40
-55 -28 100 10 15 
128 -80 -85 141 -116 
-132 112 -13 3  -70 
-117 27 97 106-110 
-149 -54 -66 132 46 


[A]-353 
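Answers

unique linear combination For an arbitrary vector $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$ in $\mathbb{R}^n$,

\[
\begin{aligned}
\mathbf{b} &= \begin{bmatrix} b_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ b_2 \\ \vdots \\ 0 \end{bmatrix} + \cdots + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ b_n \end{bmatrix} \\
&= b_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + b_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + b_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \\
&= b_1 I_{:,1} + b_2 I_{:,2} + \cdots + b_n I_{:,n}
\end{aligned}
\]

and no other linear combination of the columns of $I$ equals $\mathbf{b}$ (because each coefficient of the linear combination
affects one and only one entry of $\mathbf{b}$).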


unique linear combinations of polynomials An arbitrary element of $P_2(\mathbb{R})$ takes the form $p(t) = at^2 + bt + c$. To
write $p$ as a linear combination of the elements of $T = \{t, t^2, 1 + t^2\}$, we need real numbers $\alpha, \beta, \gamma$ such that


\[ \alpha t + \beta t^2 + \gamma(1 + t^2) = at^2 + bt + c \]


or $\gamma + \alpha t + (\gamma + \beta)t^2 = at^2 + bt + c$. From here, it is clear that we need $\gamma = c$, $\alpha = b$ and $\gamma + \beta = a$. Solving this
last equation for $\beta$, we find $\beta = a - \gamma = a - c$. The algebra shows that not only is this a solution to the problem,
it is the only one!
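
The formulas $\alpha = b$, $\beta = a - c$, $\gamma = c$ can be checked on a sample polynomial. The Python below is a small sketch added for illustration; the function names and the sample coefficients are assumptions, and the check evaluates both sides at a few values of $t$ rather than comparing them symbolically.

```python
def coords_in_T(a, b, c):
    """Coefficients (alpha, beta, gamma) with
    alpha*t + beta*t**2 + gamma*(1 + t**2) == a*t**2 + b*t + c."""
    alpha = b
    gamma = c
    beta = a - c
    return alpha, beta, gamma

def combination(alpha, beta, gamma, t):
    """Value at t of the linear combination of t, t**2, and 1 + t**2."""
    return alpha * t + beta * t**2 + gamma * (1 + t**2)

a, b, c = 2, 3, 5                      # p(t) = 2t^2 + 3t + 5, an arbitrary example
alpha, beta, gamma = coords_in_T(a, b, c)

# Two polynomials of degree at most 2 that agree at three or more points are equal.
agrees = all(combination(alpha, beta, gamma, t) == a * t**2 + b * t + c
             for t in (-2, 0, 1, 3))
```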
4.3 Functions and Transformations


Background 


Give yourself a moment to recall everything you can about functions before reading on. Go ahead. Think about 
it. Close your eyes. Close the book and just think. There are no right or wrong answers. You remember what you 
remember. 

What did you come up with? Common answers include things like “inputs and outputs”, “black box”,
“slope-intercept”, “graphing”, “zeros”, “f(x)”, “there’s a domain”, and so on. Whatever came to mind is okay; it was
probably related. Now try to divorce yourself from all those ideas. Free your mind from past experiences, and start
over. We’ll come back to the familiar ideas of function notation and graphing later. For now try to think more 
generally, more abstractly about functions. 

A function is made from three ingredients—three sets, really. Two of the sets may contain any types of objects— 
real numbers in each, fruits in one and colors in the other, car companies in one and countries in the other, subsets 
of real numbers in one and polynomials in the other, matrices in one and integers in the other—no restrictions. The 
definition of function does not specify. These two sets are called the domain and codomain. Any set can be the 
domain of a function, and any set can be the codomain of a function. The only requirements of a function are placed 
on the third set. This set must contain exactly one ordered pair for each element of the domain. The order of the 
elements is important too. The first component of each ordered pair must be an element of the domain and the second 
component must be an element of the codomain. 

A relation, like a function, is made from three sets: a domain, a codomain, and a set of ordered pairs where the first
component of the ordered pair is an element of the domain and the second component is an element of the codomain.
Unlike a function, there are no further requirements. Thus, a relation can be thought of as a relaxed function. It has
to adhere to fewer rules. Remember, the set of ordered pairs defining a function must contain exactly one ordered pair
for each element of the domain, but a relation does not have this restriction.

To be precise, even though a relation is composed of three sets, $A$, $B$, $C$, where $C$ is any subset of
$\{(a, b) : a \text{ is in } A \text{ and } b \text{ is in } B\}$, the set $C$ is the relation itself. The sets $A$ and $B$ are just ingredients in the definition of $C$. To
simplify the notation, the set $\{(a, b) : a \text{ is in } A \text{ and } b \text{ is in } B\}$ is denoted $A \times B$, read “A cross B”. To rephrase then, if
$A$ and $B$ are sets, a relation is any subset of $A \times B$, and a function is a subset of $A \times B$ containing exactly one ordered
pair for each element of $A$.
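
For finite sets, these two definitions translate almost word for word into code. The following Python sketch is an illustration only; the sets `A` and `B`, the sample relations, and the function names are assumptions made for the example.

```python
def is_relation(pairs, A, B):
    """True when pairs is a subset of A x B."""
    return all(a in A and b in B for (a, b) in pairs)

def is_function(pairs, A, B):
    """True when pairs is a relation with exactly one pair for each element of A."""
    if not is_relation(pairs, A, B):
        return False
    first_components = [a for (a, _) in pairs]
    return all(first_components.count(a) == 1 for a in A)

A = {1, 2, 3}
B = {"x", "y"}

f = {(1, "x"), (2, "x"), (3, "y")}   # one pair for each element of A
g = {(1, "x"), (1, "y"), (2, "x")}   # 1 appears twice and 3 not at all
h = {(1, "z")}                       # "z" is not in B

results = {
    "f": (is_relation(f, A, B), is_function(f, A, B)),
    "g": (is_relation(g, A, B), is_function(g, A, B)),
    "h": (is_relation(h, A, B), is_function(h, A, B)),
}
```

Note that `f` sends both 1 and 2 to the same element of `B`, which is allowed; only the first components are restricted.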

A relation $C$ might be given using terse set notation $C = \{(a, b) \in A \times B : \text{rule of correspondence}\}$ such as in


\[ C = \{(a, b) \in \mathbb{R} \times \mathbb{R} : a^2 + b^2 = 1\}. \]


The same relation might be simplified to just $a^2 + b^2 = 1$ if it is understood without writing explicitly that $a$ and $b$ are
real numbers.

When a relation is a function, we may emphasize this point using the notation $f : A \to B$, read “f is a function
from A to B”. Implied in this notation is that $f$ is a function and there exists a rule of correspondence specifying
exactly which ordered pairs $(a, b) \in A \times B$ are in $f$. To define a specific function, the familiar function notation is
often used, as in $f : A \to B$, $f(a) =$ "insert formula here", meaning $f = \{(a, b) \in A \times B : b =$ "insert formula here"$\}$.

The domain and codomain of a function are often taken for granted, fading into the background in favor of
focusing on the rule of correspondence, such as in $f(x) = 3x + 11$. It is just assumed or implied that the codomain is
the set of all real numbers and the domain is some subset of the real numbers, with good reason. It would be rather
repetitive to write $f : \mathbb{R} \to \mathbb{R}$ every time a function on the real numbers came up. More importantly, though, the
domains of many functions with succinct formulas do not contain all real numbers. The domain only includes numbers
that correspond to some element of the codomain. It would be a difficult distraction to have to write $f : A \to \mathbb{R}$ and
get the set $A$ correct every time a function was mentioned. To be complete, though, the definition of a function should
include this information, and it is a convenience not to require it. To compensate for this lack of completeness in
practice, a classic problem in algebra is to find the implied domain of functions such as $f(x) = \frac{3x+7}{2x-5}$: the subset of
real number inputs that correspond to real number outputs according to the formula. Can you find the implied domain
of $f(x) = \frac{3x+7}{2x-5}$? Answer on page 135.

The discussion of inverse functions, which will be very important for us, is made abstruse by not introducing
the notion of a relation. The inverse of a relation is a relatively simple matter. If $C$ is a relation with domain $A$
and codomain $B$, the relation with domain $B$ and codomain $A$ given by $C^{-1} = \{(b, a) : (a, b) \text{ is in } C\}$ is called the
inverse relation of $C$. Every relation has an inverse relation, no exceptions. Simply reversing the order of the ordered
pairs in any relation gives its inverse. Consequently every function (which is just a special type of relation) has an
inverse relation, and the inverse is conceptually straightforward. That is not to say that the inverse relation is always a
function, however. There are plenty of functions that do not have inverse functions, and tackling this problem before
tackling the simpler one of inverse relations is the root of much confusion and difficulty.
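
Reversing ordered pairs is easy to carry out on finite examples. The Python below is a sketch for illustration, with hypothetical function names and sample relations: it builds the inverse relation and then checks whether any first component is paired with two different second components, which is what prevents an inverse relation from being a function.

```python
def inverse_relation(pairs):
    """Reverse every ordered pair (a, b) to (b, a)."""
    return {(b, a) for (a, b) in pairs}

def is_single_valued(pairs):
    """True when no first component is paired with two different second components."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

squares = {(x, x * x) for x in range(-2, 3)}    # part of the squaring function
cubes = {(x, x ** 3) for x in range(-2, 3)}     # part of the cubing function

squares_inverse = inverse_relation(squares)     # contains both (4, -2) and (4, 2)
cubes_inverse = inverse_relation(cubes)

squares_ok = is_single_valued(squares_inverse)
cubes_ok = is_single_valued(cubes_inverse)
```

Both inverse relations exist, but only the one built from `cubes` passes the check.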


Maps and Transformations 


Map, mapping, and transformation are all synonyms for function. All four words share the same definition. When 
the codomain is not the set of real numbers, the word “function” is often supplanted by one of the others. Getting 
used to this fact is a matter of experience. 

The $3n + 1$ conjecture is that iteration (computing the sequence $n, R(n), R(R(n)), R(R(R(n))), \ldots$) of the mapping
$R : \mathbb{Z}^+ \to \mathbb{Z}^+$ defined by


\[
R(n) = \begin{cases} \dfrac{n}{2} & \text{if } n \text{ is even} \\[1ex] 3n + 1 & \text{if } n \text{ is odd} \end{cases}
\]

always ends with the cycle $4, 2, 1, 4, 2, 1, \ldots$. Try it starting with $n = 13$, for example. How many iterations does it
take to get the first 4? Answer on page 135. Don’t get too distracted, though. The point here is that $R$ is a mapping.
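
The mapping $R$ is simple enough to iterate by machine. Here is a minimal Python sketch, added for illustration; the helper names `iterates` and `steps_until`, the safety limit `max_steps`, and the starting value 6 are assumptions made for the example.

```python
def R(n):
    """The 3n + 1 mapping on positive integers."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def iterates(n, steps):
    """The list n, R(n), R(R(n)), ... with the given number of applications of R."""
    sequence = [n]
    for _ in range(steps):
        sequence.append(R(sequence[-1]))
    return sequence

def steps_until(n, target, max_steps=10_000):
    """Number of applications of R needed to first reach target, starting from n."""
    count = 0
    while n != target:
        if count >= max_steps:
            raise RuntimeError("target not reached within max_steps")
        n = R(n)
        count += 1
    return count

start = 6
trajectory = iterates(start, 8)         # 6, 3, 10, 5, 16, 8, 4, 2, 1
iterations = steps_until(start, 4)      # how many steps before the first 4
```

Changing `start` to 13 answers the question posed in the text.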

The transformation $A : C(D) \to C^1(D)$ given by $A(f) = \int_a^x f(t)\,dt$ is a standard of calculus, though it is less
common to discuss the antiderivative as a transformation during a calculus class. The derivative provides another
example of a transformation from some set of functions to another set of functions. For example, $D : P_n(\mathbb{R}) \to
P_{n-1}(\mathbb{R})$ defined by $D(p) = p'(x)$ is a map from the set of polynomials of degree at most $n$ to the set of polynomials
of degree at most $n - 1$. For example, if $p(x) = 3x^2 - 2x + 1$, then $D(p) = p'(x) = 6x - 2$. Mechanically there is no
advantage to writing the derivative as a transformation, but conceptually it gives a certain perspective on the process
of differentiation. The process itself can be thought of as a function!
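
One way to see differentiation as a function is to write it as one. In the Python sketch below (an illustration, not part of the original text), a polynomial is stored as its list of coefficients, constant term first; that storage choice and the name `D` for the code version of the map are assumptions of the sketch.

```python
def D(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] for c0 + c1*x + c2*x**2 + ...

    The derivative of c_k * x**k is k * c_k * x**(k-1), so each coefficient is
    multiplied by its index and the list is shifted down by one position.
    """
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return derivative or [0]      # the derivative of a constant is the zero polynomial

p = [1, -2, 3]        # p(x) = 3x^2 - 2x + 1
dp = D(p)             # coefficients of p'(x) = 6x - 2, constant term first
```

The input is a whole polynomial and the output is a whole polynomial, which is the sense in which the process itself is a map.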

The map $l : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^{\mathbb{N}}$ given by $l(s_0, s_1, s_2, \ldots) = (s_1, s_2, s_3, \ldots)$ is sometimes called the (left) shift operator. Its
cousin, the left bit shift operator (called left-shift) plays a huge role in computing.
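
Both shifts can be tried out directly. The Python sketch below is illustrative only: the function name `left_shift` is an assumption, the infinite sequence is modeled by a Python iterator, and the bit shift shown is Python's built-in `<<` operator on integers.

```python
from itertools import count, islice

def left_shift(seq):
    """Drop the first term of a (possibly infinite) sequence given as an iterable."""
    iterator = iter(seq)
    next(iterator, None)      # discard s_0
    return iterator

# The sequence 0, 1, 2, 3, ... shifted left begins 1, 2, 3, ...
first_terms = list(islice(left_shift(count(0)), 5))

# The bit-level cousin: shifting the binary digits of 5 (101) one place left gives 1010.
shifted_bits = 5 << 1
```

The two operations are related only by analogy: one discards the leading term of a sequence, while the other moves binary digits and so doubles the integer.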

The determinant is a function or map $\det : M_{n \times n}(F) \to F$ for each $n$ since each $n \times n$ matrix has exactly one
determinant. The transformation $Z : GL_n(F) \to GL_n(F)$ mapping an invertible matrix to its inverse provides motivation to
think of finding the inverse of a matrix as a function as well. The most common transformation considered in linear
algebra, however, is the transformation $T : \mathbb{R}^n \to \mathbb{R}^m$, $T(\mathbf{x}) = M\mathbf{x}$ for some matrix $M$. Transformations $T$ defined
this way have two properties:


1. $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})$ for any $\mathbf{x}$ and any $\mathbf{y}$ in $\mathbb{R}^n$.
2. $T(c\mathbf{x}) = cT(\mathbf{x})$ for any $c$ in $\mathbb{R}$ and $\mathbf{x}$ in $\mathbb{R}^n$.


Can you justify this claim? Answer on page 135. These two properties mean that performing addition and scalar
multiplication in the domain and then transforming (the left-hand sides of the equations) gives the same result as
transforming first and then performing the addition and scalar multiplication in the codomain (the right-hand sides
of the equations). For this reason we say this type of transformation preserves the operations of addition and scalar
multiplication. Its properties form the essence for the abstraction of this idea to arbitrary vector spaces.
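
The two properties can be spot-checked numerically. The sketch below is an illustration in plain Python; the matrix `M`, the vectors `x` and `y`, and the scalar `c` are arbitrary choices, and agreement on one example is evidence rather than a proof.

```python
def mat_vec(M, x):
    """Matrix-column product: entry i is the dot product of row i of M with x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def scale(c, x):
    return [c * xi for xi in x]

M = [[1, 2, 0],
     [-1, 3, 4]]          # a 2 x 3 matrix, so T maps R^3 to R^2
x = [1, 0, 2]
y = [3, -1, 1]
c = 5

# Property 1: add in the domain and then transform, versus transform and then add.
left_1 = mat_vec(M, add(x, y))
right_1 = add(mat_vec(M, x), mat_vec(M, y))

# Property 2: scale in the domain and then transform, versus transform and then scale.
left_2 = mat_vec(M, scale(c, x))
right_2 = scale(c, mat_vec(M, x))
```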


Linear Transformations 


Given vector spaces $V$ and $W$, a linear transformation is any transformation $L : V \to W$ such that for every $\mathbf{x}$, $\mathbf{y}$ in
$V$ and scalar $c$,


1. $L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y})$ and
2. $L(c\mathbf{x}) = cL(\mathbf{x})$.


This definition is modeled after $T : \mathbb{R}^n \to \mathbb{R}^m$, $T(\mathbf{x}) = M\mathbf{x}$, making $T$ the canonical example of a linear transformation.
Some of the other transformations mentioned in this section are linear transformations and some are not. For example,
$R$ is not since $R(2 + 3) = 16$ but $R(2) + R(3) = 1 + 10 = 11$, and $11 \neq 16$. It is easy to find positive integers that violate
property 1. The derivative is a linear transformation since two basic results of calculus are that $(f + g)' = f' + g'$ and
$(cf)' = cf'$. These rules of differentiation say precisely that properties 1 and 2 hold when differentiation is viewed
as a mapping. In other words, $D(f + g) = D(f) + D(g)$ and $D(cf) = cD(f)$. What about $A$, $l$, det, and $Z$? Are they
linear? Answers on page 135.
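
A single counterexample is enough to show a map is not linear, so failures are easy to exhibit in code. The Python sketch below is an illustration: polynomials are stored as coefficient lists (constant term first), the sample polynomials are arbitrary, and the matrix `A` used for the determinant is an assumed example chosen for the sketch, not necessarily the one used in the answers.

```python
def R(n):
    """The 3n + 1 mapping."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def D(coeffs):
    """Derivative of c0 + c1*x + c2*x**2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# R violates property 1.
r_of_sum = R(2 + 3)
sum_of_r = R(2) + R(3)

# D satisfies both properties on this sample (a check, not a proof).
p = [1, -2, 3]                       # 3x^2 - 2x + 1
q = [4, 0, 1]                        # x^2 + 4
p_plus_q = [a + b for a, b in zip(p, q)]
d_additive = D(p_plus_q) == [a + b for a, b in zip(D(p), D(q))]
d_homogeneous = D([3 * a for a in p]) == [3 * a for a in D(p)]

# det violates property 1 with A the identity and B = -A.
A = [[1, 0], [0, 1]]
B = [[-1, 0], [0, -1]]
A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
det_of_sum = det2(A_plus_B)
sum_of_dets = det2(A) + det2(B)
```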
Vocabulary of Transformations


If $T : A \to B$ is a transformation, then


• $T(a)$ is called the image of $a$.

• If $S$ is a subset of $A$, the set of all images, $\{T(x) : x \text{ is in } S\}$, is called the image of $S$.

• The set of all images, $\{T(a) : a \text{ is in } A\}$, is called the range of $T$, denoted $\operatorname{range}(T)$.

• $a$ is called a preimage of $b$ whenever $T(a) = b$.

• The set of all preimages of an element $b$ of $B$ is also called the preimage of $b$.

• The kernel of $T$ is the preimage of $\mathbf{0}$.

• When the inverse relation of $T$ is a function, it is denoted $T^{-1} : \operatorname{range}(T) \to A$ and $T^{-1}$ is called the inverse
function of $T$ or simply the inverse of $T$.


It is not an accident that the notation $T^{-1}$ is used for the inverse of $T$. The notion of an inverse function extends the
idea of multiplicative (and additive) inverses. By definition, if $T$ has an inverse function $T^{-1}$, then $T^{-1}(T(a)) = a$
for any $a$ in $A$. Can you justify this claim? Answer on page 136. Start with $a$, end with $a$. Composing $T^{-1}$ with $T$
returns the starting value, much like $M^{-1}(MA) = A$: left-multiplying the matrix $A$ by invertible matrix $M$ and then
left-multiplying the result by the matrix $M^{-1}$ results in the starting value $A$. Start with $A$, end with $A$.

Key Concepts 


Cartesian product given any sets $A$ and $B$, the Cartesian product of $A$ and $B$ is the set $\{(a, b) : a \in A \text{ and } b \in B\}$,
denoted $A \times B$.

relation given any sets $A$ and $B$, a relation is a subset $C$ of $A \times B$.

function given any sets $A$ and $B$, a function is a relation $C$ such that if $(a, b_1) \in C$ and $(a, b_2) \in C$ then $b_1 = b_2$.

domain the set $A$ in the definitions of relation and function.

codomain the set $B$ in the definitions of relation and function.

rule of correspondence the rule that defines the set $C$ in the definitions of relation and function.

inverse relation (of a relation $T$ with domain $A$ and codomain $B$) the relation with domain $B$, codomain $A$, and rule
of correspondence $\{(T(a), a) : a \text{ is in } A\}$.

inverse function the inverse relation of a function whenever it happens to be a function.

inverse an inverse function. If $T^{-1} : \operatorname{range}(T) \to A$ is the inverse of $T : A \to B$, then $T^{-1}(T(a)) = a$ for all $a$ in $A$.

invertible a function is called invertible if its inverse relation is a function.

kernel the preimage of $\mathbf{0}$ whenever the codomain of a transformation is a vector space.

map a function, usually used when the codomain is not $\mathbb{R}$.

mapping a function, usually used when the codomain is not $\mathbb{R}$.

transformation a function, usually used when the codomain is not $\mathbb{R}$.

image an element, $T(a)$, of the codomain of a transformation $T$. Also, if $S$ is a subset of the domain of $T$, the set
$\{T(x) : x \text{ is in } S\}$.

preimage if $b$ is an element of the codomain of a transformation $T$, then an element $a$ in the domain of $T$ is a
preimage of $b$ whenever $T(a) = b$. Also, the set of all elements $a$ in the domain of $T$ such that $T(a) = b$.

range the set of all images, $\operatorname{range}(f) = \{f(a) : a \text{ is in } A\}$ for any function $f : A \to B$.

linear transformation given vector spaces $V$ and $W$, a transformation $L : V \to W$ is linear (is a linear transformation)
if for any $\mathbf{x}$, $\mathbf{y}$ in $V$ and any scalar $c$,


1. $L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y})$ and
2. $L(c\mathbf{x}) = cL(\mathbf{x})$.
Exercises


1. 


10. 


Let A = {a,b,c,d} and B = (3. i i 4}. Is the set a rela- 
tion from A to B? 


(a) {(a3 


ie) 
(a) State a possible domain.

(b) State a possible codomain. 

(c) State the range. 
The set {(1,1),(2,17),(3,0°),(4,74),(5,0°)} is a rela- 
tion. [S]-308 

(a) State a possible domain. 

(b) State a possible codomain. 


(c) State the range. 


Redo question 3 where the set is a function instead of a 
relation. 


Redo question 4 where the set is a function instead of a 
relation. [S]-308 
1 


Let p be the mapping p : Z* —> Q, p(@) = sa- 
(a) Find the image of 3. 
(b) Find the image of 23. 
(c) Find a preimage of h. 
(d) Find the preimage of i. 
(e) Find the image of {1, 2, 3}. 


. Let p be the mapping p : Z > Q, p(z) = za- [S]-308 


(a) Find the image of 3. 

(b) Find the image of 23. 

(c) Find a preimage of z. 

(d) Find the preimage of z, 
(e) Find the image of {1, 2, 3}. 


-rewion = {C0 L13 Ds Ls hs 


subset of R? x R?. Find R7!. 


1 
Relation R = ( +3,) 0 
3 


=| I) 


subset of P,(R) x R?. Find R™!. [S]-308 


11. For the transformation T it is known _ that 
T 1 z 1 -2 T -1 _| -l -2 
2 “| 2 1 7 2 ~) -2 -1 7 
T 7 — ae , and that T is invertible. 
-1 1 3 
: : 3 1 
(a) Find a preimage ot 1 3 } [A]-353 


3 1 
1 3 }) [A]-353 


(c) Find a preimage of | ie Fs } 


1 -2 
—2 1 
(e) Make a general statement about preimages and in- 
verses. 


(b) Find T! ( 


(d) Find T! (| 


12. Find the image of $\mathbf{a}$ under the mapping $T : \mathbb{R}^n \to \mathbb{R}^m$,
$T(\mathbf{v}) = M\mathbf{v}$.


7 -3 3 
(c) M=} -9 -l a=| 75 | 

8 —-4 

2 9 8 10 
(d) M=| -6 O -I |;a=] 6 

-10 -9 10 2 

13. Show that $f : \mathbb{R} \to \mathbb{R}$ is not a linear transformation.

(a) f(x) =e 
(b) f(x) = Inx [A]-353 
(c) fay=x 


(d) f(x) = vVx-3 


14. Show that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation.


(a) T(x) = 13x 


75 )-[ 
orl b[S SI 


offal 


15. Show that L: P,,,(R) — P,,(R) is a linear transformation. 


[A]-353 


(a) L(ax + b) = sax +bx+c 
(b) L(ax? + bx +c) = 6ax + 3b [A]-353 


16. Perhaps ironically, what is known as a linear function in 
algebra is not necessarily a linear transformation. Verify 
that f:R- R, f@®) =mx+b 


(a) is a linear transformation when b = 0 
Al


| —2 | 20. Can you find a matrix M such that the transformation T 
, and 


17. Show that det : M>,.(R) — R is not a linear transforma- 


(b) is not a linear transformation when b # 0 
(d) T 
tion. 


18. T : R? > R’ is linear, r(| a } = 


4 of question 18 can be expressed as T(x) = Mx? [A]-354 


‘ 21. Can you find a matrix M such that the transformation T 
7 ( 2 } a Find of question 19 can be expressed as T(x) = Mx? 
=| —-5 |. Fin 
-1 1 x 0 —-2 6 x 
22. Let T}} y =|-7 -4 5 y |. Find the 
-2 
(a) T ( 8 \ Zz 1 2 -5 Zz 
2 
(b) r(| 6 } [A]-353 preimage of | —3 |. Is T invertible? Explain. 
=3 -l 
(c) T ( 1 } 23. The range of the function g : R > R, g(x) = x’ is [0, 09). 
3 Therefore, the inverse relation of g is 
-3 
(d) r(| 5 \ [A]-353 gl ={(y,x) € [0,00) xR: y = 2}. 
~2 . : ays 
= Verify that g is a function and argue that is not. 
19. T : R? => R? is linear, T}| 0 = : } and - ° . ° 
4 . 24. Justify the claim. 
5 
rl} a |l= -5 Find (a) For any function C € A x B, if (a,b,) € C and 
-1 1 ; (a, b2) € C then b, = bo. 
4 (b) If the mapping L : V > Wis linear, then L(0) = 0. 
Note: the 0 on the lefthand side is the zero vec- 
eee | : ae ie tor of V and the 0 on the righthand side is the zero 
vector in W. They are not necessarily equal. 
15 
(b) TI} -12 (c) The composition of linear transformations is lin- 
3 ear. 
1 (d) A map L : V — W is linear if and only if 
(c) T}} -4 [A]-354 L(x + cy) = L(x) + cL(y) for every x,y in V and 
7 scalar c. 


Answers 


implied domain The implied domain is the set of all inputs that correspond to real number outputs. The function
$f(x) = \frac{3x+7}{2x-5}$ includes a fraction, a quantity that is undefined precisely when the denominator is zero. Hence we
must discard all numbers that satisfy $2x - 5 = 0$. Solving this equation for $x$ is a simple matter: $x = \frac{5}{2}$ and so
the implied domain is $\mathbb{R} \setminus \{\frac{5}{2}\}$ (all real numbers except $\frac{5}{2}$).


3n + 1 problem $R(13) = 3(13) + 1 = 40$ since 13 is odd. $R(R(13)) = R(40) = \frac{40}{2} = 20$ since 40 is even. $R(R(R(13))) =
R(R(40)) = R(20) = 10$, and so on. The sequence $n, R(n), R(R(n)), R(R(R(n))), \ldots$ is


13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, 4, 2, ...

so it takes 7 iterations to get the first 4.
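two properties For $T : \mathbb{R}^n \to \mathbb{R}^m$, $T(\mathbf{x}) = M\mathbf{x}$ where $M$ is a matrix,


1. $T(\mathbf{x} + \mathbf{y}) = M(\mathbf{x} + \mathbf{y}) = M\mathbf{x} + M\mathbf{y} = T(\mathbf{x}) + T(\mathbf{y})$; and
2. $T(c\mathbf{x}) = M(c\mathbf{x}) = c(M\mathbf{x}) = cT(\mathbf{x})$.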


linear or not? Another two results of calculus are that $\int (f + g) = \int f + \int g$ and $\int (cf) = c \int f$, so $A(f + g) = A(f) +
A(g)$ and $A(cf) = cA(f)$. $A$ is a linear transformation. On one hand, $l(\mathbf{s} + \mathbf{t}) = l\big((s_0, s_1, s_2, \ldots) + (t_0, t_1, t_2, \ldots)\big) =
l(s_0 + t_0, s_1 + t_1, s_2 + t_2, \ldots) = (s_1 + t_1, s_2 + t_2, \ldots)$. On the other hand, $l(\mathbf{s}) + l(\mathbf{t}) = l(s_0, s_1, s_2, \ldots) + l(t_0, t_1, t_2, \ldots) =
(s_1, s_2, \ldots) + (t_1, t_2, \ldots) = (s_1 + t_1, s_2 + t_2, \ldots)$. Therefore $l(\mathbf{s} + \mathbf{t}) = l(\mathbf{s}) + l(\mathbf{t})$. Property 1 holds for $l$. To
check that property 2 holds, note that $l(c\mathbf{s}) = l(cs_0, cs_1, cs_2, \ldots) = (cs_1, cs_2, \ldots)$ and $c\,l(\mathbf{s}) = c\,l(s_0, s_1, s_2, \ldots) =
c(s_1, s_2, \ldots) = (cs_1, cs_2, \ldots)$ too. det is not a linear transformation. For example, let A = | : : and B = -A.
Then detA = det B = 1, so det(A) + det(B) = 2 while det(A + B) = 0. Matrix inversion is also not a linear 
transformation. Using the same matrices A and B, Al= - - | and B7! = - = so the sum of 
. . | 0 0 : ‘ 0 0 
the inverses is 0 0 but the inverse of the sum does not exist. In other words, V(A) + V(B) = 0 0 but 


V(A + B) does not exist, so they are not equal. 


definition of inverse Since $a$ is in the domain of $T$, there must be an ordered pair $(a, b)$ in $T$, meaning $T(a) = b$.
By definition $(b, a)$ is then in the inverse of $T$, meaning $T^{-1}(b) = a$. Putting these facts together, $T^{-1}(T(a)) =
T^{-1}(b) = a$.
4.4 Linear Transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$


Geometric Interpretation of Linear Transformations 


Drawing vectors as arrows, as in section 1.4, gives us a way to picture linear transformations. We can draw any vector 
and its image to help understand the action of the map geometrically. For example, if every vector we draw has an 
image pointing in the same direction but twice as long, that gives us a clear picture of how it transforms vectors. It 
doubles their magnitudes. 

Since vectors do not have an inherent starting location, we can always imagine them starting anywhere. In the 
case of picturing linear transformations, it is helpful to imagine vectors rooted at the origin. Much like plotting points 
in the plane, these vectors are marked off starting at (0, 0). Given this special use of the vector, there is little distinction 
between the point at the head of an arrow and the arrow itself. For this reason, it is just as common to imagine a vector 
as a point in the plane as it is to imagine it as an arrow. 


Consider the linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$,


rw) =| | 5 


kes ee 


rically these facts are captured by the diagram 


Vv. 


The image of 


; Ecce. 32 1 7 
and the image of| > lis] 1-2 | > |-| 3 Geomet- 


[Figure: the two vectors (left) and their images under T (right), drawn as arrows rooted at the origin.]
4 4 


where the vectors have been represented by arrows rooted at the origin and the T indicates that the change is due to the 
transformation T. The vectors have also been color coded so the image of the brown vector is brown and the image 
of the green vector is green. From just these two sample vectors, it is hard to describe just what the transformation 
does in general. This is where it is helpful to interpret vectors as points. If we color a bunch of points, transform 
each of them, one at a time, giving their images the same color, we get a much clearer picture of the action of the 
transformation. 


Consider again the transformation $T$, but this time imagine the vectors it acts on and their images as points. Coloring all the points in the square with opposite corners at $(-2, -2)$ and $(2, 2)$ to manifest as a photo of a coffee mug³ and coloring their images accordingly is summarized in the following diagram.


³ Photo by Dziana Hasanbekava from Pexels
The same brown and green vectors as before are superimposed on the picture to help relate back to this interpretation.
The point at the end of the green arrow is sky blue and lands on the boundary of the picture before transforming,
so the point at the end of the green arrow is sky blue and lands on the boundary of the picture after transforming, too. 
A large portion of the green vector runs up the side of the coffee mug both before and after transformation. The point 
at the end of the brown arrow is sky blue and lands just next to the handle of the coffee mug before transforming, so 
the point at the end of the brown arrow is sky blue and lands just next to the handle of the coffee mug after transform- 
ing, too. In all, the transformed image is larger than the original, rotated, reflected (the handle is on the left of the 
coffee mug in one picture and on the right in the other), and sheared (the transformed picture covers a parallelogram, 
not a square). These are the words we use to describe the action of the transformation. It scales, rotates, reflects, and 
shears the plane, and objects in it. Actually, these are the only invertible actions a linear transformation can take on 
the plane, as we will see. 
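
The point-by-point picture can be imitated numerically. The sketch below assumes a hypothetical matrix `M` (it is not the matrix behind the coffee mug figure) and transforms the four corners of the square with opposite corners at (-2, -2) and (2, 2). The determinant is used as a rough summary: its absolute value is the factor by which areas scale, and a negative sign signals that a reflection is involved.

```python
# Hypothetical 2 x 2 matrix chosen only for illustration; it is NOT the
# matrix used for the coffee mug figure in the text.
M = [[1, 2],
     [3, 1]]


def apply(matrix, point):
    """Return the image of the point (x, y) under multiplication by a 2 x 2 matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


# Corners of the square, listed counterclockwise.
corners = [(-2, -2), (2, -2), (2, 2), (-2, 2)]
images = [apply(M, p) for p in corners]

# |det| is the area scale factor; det < 0 means orientation is reversed.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

print(images)
print("area scale factor:", abs(det), "| reflection involved:", det < 0)
```

The images of the corners are the corners of a parallelogram, which is why the transformed picture in such a diagram is generally not a square.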


The Matrix of a Linear Transformation 


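Suppose a map $G : \mathbb{R}^n \to \mathbb{R}^m$ is given by $G(\mathbf{v}) = M\mathbf{v}$ for some $m \times n$ matrix $M$. Then, by theorem 2 part 5, $G(\mathbf{u} + \mathbf{v}) = M(\mathbf{u} + \mathbf{v}) = M\mathbf{u} + M\mathbf{v} = G(\mathbf{u}) + G(\mathbf{v})$ and by theorem 3 part 4, $G(c\mathbf{u}) = M(c\mathbf{u}) = c(M\mathbf{u}) = cG(\mathbf{u})$ for any vectors $\mathbf{u}$ and $\mathbf{v}$ and any scalar $c$. Therefore $G$ is a linear transformation.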

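Suppose a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is given by matrix multiplication by an $m \times n$ matrix $A$. That is, $T(\mathbf{v}) = A\mathbf{v}$. Then it is easy to calculate its action on the vectors

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad
\begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad
\begin{bmatrix} 0 \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\quad \ldots,\quad
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix},
\]

the columns of the $n \times n$ identity matrix. Thinking of matrix multiplication of a vector as a linear combination of the columns of the matrix, it is clear that

\[
A\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = A_{\cdot,1},\quad
A\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} = A_{\cdot,2},\quad \ldots,\quad
A\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = A_{\cdot,n}.
\]

In short, $A I_{\cdot,j} = A_{\cdot,j}$. The $j$th column of $A$ is the image of the $j$th column of $I$.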
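
A quick numerical check of this observation may help. The sketch below assumes a hypothetical $2 \times 3$ matrix `A` (so the map goes from $\mathbb{R}^3$ to $\mathbb{R}^2$) and plain Python lists for vectors; the helper names are illustrative, not from the text.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def identity_column(n, j):
    """Return column j (0-indexed) of the n x n identity matrix."""
    return [1 if i == j else 0 for i in range(n)]


def column(A, j):
    """Return column j (0-indexed) of matrix A."""
    return [row[j] for row in A]


# Hypothetical 2 x 3 matrix, so T(v) = Av maps R^3 to R^2.
A = [[2, -1, 0],
     [4, 3, 5]]

# The image of each identity column is the matching column of A.
for j in range(3):
    print(mat_vec(A, identity_column(3, j)), column(A, j))
```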


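On the other hand, suppose you know the images of the columns of $I$ but are not given $T$ as a matrix product. All you know is

\[
T(I_{\cdot,1}) = \mathbf{c}_1,\quad T(I_{\cdot,2}) = \mathbf{c}_2,\quad \ldots,\quad T(I_{\cdot,n}) = \mathbf{c}_n
\]

for some $m \times 1$ vectors $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$. It turns out this is enough information to determine $T$, and $T$ can be represented by matrix multiplication! Let $\mathbf{v} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}^T$ be an arbitrary $n \times 1$ vector. Due to linearity,

\[
\begin{aligned}
T(\mathbf{v}) &= T\left(\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}\right) \\
&= T\left(\begin{bmatrix} v_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ v_2 \\ \vdots \\ 0 \end{bmatrix} + \cdots + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ v_n \end{bmatrix}\right) \\
&= T\left(v_1\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + v_2\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + v_n\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}\right) \\
&= T(v_1 I_{\cdot,1} + v_2 I_{\cdot,2} + \cdots + v_n I_{\cdot,n}) \\
&= T(v_1 I_{\cdot,1}) + T(v_2 I_{\cdot,2}) + \cdots + T(v_n I_{\cdot,n}) \\
&= v_1 T(I_{\cdot,1}) + v_2 T(I_{\cdot,2}) + \cdots + v_n T(I_{\cdot,n}) \\
&= v_1\mathbf{c}_1 + v_2\mathbf{c}_2 + \cdots + v_n\mathbf{c}_n \\
&= \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_n \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_n \end{bmatrix}\mathbf{v}.
\end{aligned}
\]

In other words, to represent $T$ as a matrix product, form the matrix with columns equal to the images of the columns of $I$.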


These calculations justify the following theorem.


Theorem 10. [The Standard Matrix of a Linear Transformation] Given a transformation $T : \mathbb{R}^n \to \mathbb{R}^m$, $T$ is linear if and only if $T(\mathbf{v}) = M\mathbf{v}$ where $M_{\cdot,j} = T(I_{\cdot,j})$, $j = 1, 2, \ldots, n$. $M$ is called the standard matrix of $T$.


In words, a transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ is linear if and only if it can be represented by multiplication by a matrix whose columns are the images of the columns of the identity matrix.
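
Theorem 10 suggests a recipe that is easy to try on a computer: apply the transformation to each column of the identity and use the results as columns. The sketch below assumes transformations are given as Python functions on lists. `reflect_x` is reflection about the x-axis, and `square_first` is a hypothetical non-linear map included only for contrast. For a linear map the matrix reproduces the map everywhere; for a non-linear map it fails on some vector.

```python
def standard_matrix(T, n):
    """Build the matrix whose j-th column is T applied to the j-th column of I."""
    images = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(images[0])
    return [[images[j][i] for j in range(n)] for i in range(m)]


def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reflect_x(v):
    """Reflection about the x-axis (linear)."""
    return [v[0], -v[1]]


def square_first(v):
    """Hypothetical map that squares the first entry (not linear)."""
    return [v[0] ** 2, v[1]]


M = standard_matrix(reflect_x, 2)
N = standard_matrix(square_first, 2)

v = [2, 1]
print(M, mat_vec(M, v) == reflect_x(v))     # the matrix agrees with the map
print(N, mat_vec(N, v) == square_first(v))  # the matrix disagrees with the map
```

Agreement on a handful of vectors does not prove linearity, but a single disagreement does prove a map is not linear.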


This observation can be applied immediately to write down the algebraic (matrix) representation of transformations with which you may already be familiar. For example, consider reflection about the x-axis in the plane, call it $F_x$. Geometrically, this transformation can be illustrated as in the following diagram.

The image of $I_{\cdot,1}$ is $I_{\cdot,1}$ and the image of $I_{\cdot,2}$ is $-I_{\cdot,2}$, so if this is a linear transformation,

\[
F_x(\mathbf{v}) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\mathbf{v}.
\]

It is easy to verify that $F_x$ acts on the entire plane (not just the columns of $I$) in the way expected.

\[
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} v_1 \\ -v_2 \end{bmatrix},
\]

so the image of $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ is $\begin{bmatrix} v_1 \\ -v_2 \end{bmatrix}$, the reflection of $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ about the x-axis. It must be that reflection about the x-axis is a linear transformation. Can you justify this using the definition of linear transformation? Answer on page 146.


Notice that $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ is an elementary matrix (scale the second row by $-1$). Every elementary $2 \times 2$ matrix takes one of the following five forms (swap, scale first row, scale second row, replace first row, replace second row, respectively) for some scalar $r$ or $s \neq 0$:

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\quad
\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix},\quad
\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix},\quad
\begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix},\quad
\begin{bmatrix} 1 & 0 \\ r & 1 \end{bmatrix}.
\]

The following series of diagrams illustrates the types of transformations attainable by multiplication by these elementary matrices. Geometrically they are reflection, scaling, scaling with reflection, and shearing.
matrix M | image under T(v) = Mv | Notes

$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ | [figure]
  • reflection about the line y = x
  • $I_{\cdot,1}$ and $I_{\cdot,2}$ swap places

$\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix}$ | [figure]
  • horizontal scale by factor s
  • includes reflection about the y-axis when s < 0
  • expansion when |s| > 1; compression when |s| < 1

$\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix}$ | [figure]
  • vertical scale by factor s
  • includes reflection about the x-axis when s < 0
  • expansion when |s| > 1; compression when |s| < 1

$\begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}$ | [figure]
  • horizontal shear by factor r
  • $I_{\cdot,1}$ is unaffected

$\begin{bmatrix} 1 & 0 \\ r & 1 \end{bmatrix}$ | [figure]
  • vertical shear by factor r
  • $I_{\cdot,2}$ is unaffected




As we have seen previously, any invertible matrix can be written as a product of elementary matrices. If there were some connection between matrix multiplication and linear transformations, we would be on our way to a comprehensive characterization of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. By theorem 2 part 4, $A(B\mathbf{v}) = (AB)\mathbf{v}$. In terms of linear transformations, the left side, $A(B\mathbf{v})$, represents applying the transformation whose associated matrix is $B$ first and then applying the transformation whose associated matrix is $A$ to the result. In other words, $A(B\mathbf{v})$ represents composing the two transformations whose associated matrices are $B$ and $A$. The right side, $(AB)\mathbf{v}$, represents applying the transformation whose associated matrix is the product $AB$. It must be that matrix multiplication corresponds to function composition! To facilitate the following discussion, which puts these words into symbols and elaborates, for any matrix $M$ we adopt the notation $T_M$ for the linear transformation $T_M : \mathbb{R}^n \to \mathbb{R}^m$, $T_M(\mathbf{v}) = M\mathbf{v}$.


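Letting $T_A$ be an arbitrary linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ and $T_B$ an arbitrary linear transformation from $\mathbb{R}^p$ to $\mathbb{R}^n$, the following calculation encapsulates the idea that matrix multiplication corresponds to function composition.

\[
\begin{aligned}
(T_A \circ T_B)(\mathbf{v}) &= T_A(T_B(\mathbf{v})) \\
&= T_A(B\mathbf{v}) \\
&= A(B\mathbf{v}) \\
&= (AB)\mathbf{v} \\
&= T_{AB}(\mathbf{v}).
\end{aligned}
\]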
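
The correspondence between composition and multiplication can be spot-checked with small matrices. The sketch below assumes two hypothetical $2 \times 2$ matrices, a horizontal shear `A` and the swap matrix `B`, and compares applying `B` then `A` against applying the single product `AB`. It also shows that the order of composition matters.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Multiply matrices A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2],   # horizontal shear by factor 2
     [0, 1]]
B = [[0, 1],   # swap: reflection about the line y = x
     [1, 0]]
v = [3, 5]

composed = mat_vec(A, mat_vec(B, v))   # apply B first, then A
product = mat_vec(mat_mul(A, B), v)    # apply the single matrix AB
reversed_order = mat_vec(mat_mul(B, A), v)

print(composed, product, reversed_order)
```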


This calculation has two important consequences. First, if $M$ is invertible, then $T_M \circ T_{M^{-1}} = T_{M^{-1}} \circ T_M = T_I$. In other words, $(T_M \circ T_{M^{-1}})(\mathbf{v}) = (T_{M^{-1}} \circ T_M)(\mathbf{v}) = T_I(\mathbf{v}) = I\mathbf{v} = \mathbf{v}$, so $T_M$ is invertible and $(T_M)^{-1} = T_{M^{-1}}$. Second, if $M$ is invertible and we write $M$ as the product of elementary matrices, $E_1 E_2 \cdots E_k$, then

\[
\begin{aligned}
T_M(\mathbf{v}) &= M\mathbf{v} \\
&= (E_1 E_2 \cdots E_k)\mathbf{v} \\
&= (T_{E_1} \circ T_{E_2} \circ \cdots \circ T_{E_k})(\mathbf{v}),
\end{aligned}
\]

so $T_M$ is a composition of linear transformations whose associated matrices are elementary matrices.
Finally, because the actions of transformations defined by $2 \times 2$ elementary matrices include only reflection, scaling, and shearing, these actions and compositions of them are the only actions of invertible linear transformations on $\mathbb{R}^2$.
So what about $T_M$ where $M$ is noninvertible? Noninvertible matrices have linearly dependent columns and linearly dependent rows. In the case of $2 \times 2$ matrices that means the rows are multiples of one another or one of the rows contains two zeros. Likewise, either its columns are multiples of one another or one of the columns contains two zeros. In any case, it has the form

\[
N = \begin{bmatrix} ka & \ell a \\ kb & \ell b \end{bmatrix}
\]

for some scalars $a, b, k, \ell$. Can you verify this form covers all four cases? Answer on page 146. If $k \neq 0$,

\[
N = \begin{bmatrix} ka & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ kb & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & \ell/k \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} ka & \ell a \\ kb & \ell b \end{bmatrix}
\]

and if $k = 0$,

\[
N = \begin{bmatrix} 1 & 0 \\ 0 & \ell b \end{bmatrix}
\begin{bmatrix} 1 & \ell a \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & \ell a \\ 0 & \ell b \end{bmatrix}.
\]

Either way, $N$ can be written as the product of elementary matrices and either

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\]

These matrices are called projection matrices. Their action is to squash (or project) the entire plane onto the x-axis or the y-axis, respectively, as shown below acting on the same figure as before. The brown line segment in the diagrams below indicates the part of the axis in the image that is not covered by the vector.


matrix M | image under T(v) = Mv | Notes

$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ | [figure]
  • projection onto the x-axis
  • $I_{\cdot,1}$ is unaffected
  • $I_{\cdot,2}$ is squashed to the origin

$\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$ | [figure]
  • projection onto the y-axis
  • $I_{\cdot,2}$ is unaffected
  • $I_{\cdot,1}$ is squashed to the origin




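These projections complete the characterization of linear transformations from $\mathbb{R}^2$ to $\mathbb{R}^2$, also called linear operators on $\mathbb{R}^2$. Because every $2 \times 2$ matrix can be written as a product of the matrices

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\
\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix},\
\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix},\
\begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix},\
\begin{bmatrix} 1 & 0 \\ r & 1 \end{bmatrix},\
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

and multiplication by these matrices represents reflection, scaling, shearing, and/or projection, we have the following theorem.


Theorem 11. [Characterization of Linear Transformations from $\mathbb{R}^2$ to $\mathbb{R}^2$] A transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ is linear if and only if it is a composition of some sequence of reflections, scalings, shearings, and projections.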
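
The noninvertible case can be spot-checked numerically. The sketch below assumes sample values $k = 2$, $\ell = 3$, $a = 5$, $b = 7$ (chosen only for illustration) and one factorization of $N$ that can be verified by hand when $k \neq 0$: a scale, a vertical shear, the projection onto the x-axis, and a horizontal shear. Exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Multiply matrices A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Sample values, assumed for illustration only (k must be nonzero here).
k, l, a, b = 2, 3, 5, 7

N = [[k * a, l * a],
     [k * b, l * b]]

scale = [[k * a, 0], [0, 1]]                   # scale first row by ka
shear_v = [[1, 0], [k * b, 1]]                 # vertical shear by kb
proj_x = [[1, 0], [0, 0]]                      # projection onto the x-axis
shear_h = [[1, Fraction(l, k)], [0, 1]]        # horizontal shear by l/k

product = mat_mul(scale, mat_mul(shear_v, mat_mul(proj_x, shear_h)))
det_N = N[0][0] * N[1][1] - N[0][1] * N[1][0]

print(product == N, det_N)
```

The zero determinant confirms that this $N$ is noninvertible, and the projection factor is where the collapse onto a line happens.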


Key Concepts 
geometric interpretation of vectors: vectors in $\mathbb{R}^n$ are often thought of as points.


matrices and linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$: a transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear if and only if $T(\mathbf{v}) = M\mathbf{v}$ where $M_{\cdot,j} = T(I_{\cdot,j})$, $j = 1, 2, \ldots, n$.

elementary matrices as linear transformations of the plane: the action of a swap matrix is reflection about the line y = x; the action of a scale matrix is scaling and possibly reflection; the action of a replacement matrix is shearing.


noninvertible linear transformations of the plane: a linear transformation of the plane, $T_M$, is noninvertible if and only if $M$ is the product of a projection matrix with elementary matrices.


characterization of linear transformations of the plane: see theorem 11.


standard matrix: see theorem 10.


projection matrix (onto an axis in $\mathbb{R}^2$): $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ or $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$.


Exercises 


1. (i) Find the standard matrix of $T : \mathbb{R}^2 \to \mathbb{R}^2$; and (ii) find
the image of | Lt 1 ie 


(a) T shears points so that the image of (1, 0) is (1, - 5) 
while (0, 1) is unaffected. [A]-354 


(b) T reflects points through the origin. 
(c) T scales horizontally by a factor of 3. [A]-354 
(d) T projects onto the y-axis. 


2. Argue that even though the transformations in exercises 
1b and 13c have different geometric descriptions, they 
are the same transformation. 


3. Find the standard matrix of $T : \mathbb{R}^2 \to \mathbb{R}^2$.


(a) T reflects points about the y-axis and then scales 
them both vertically and horizontally by a factor 
of 2. [S]-308 

(b) T scales points vertically by a factor of 5 and then 
shears them vertically by a factor of i. 


(c) T reflects points about the x-axis and then shears 
horizontally by a factor of 0.77. 

(d) T dilates points horizontally and vertically by a 
factor of a and then shears horizontally by a factor 
of 3. [A]-354 


(e) T shears points vertically such that [ 1 0 l' 


maps to [ 1 1.44 | and then reflects points about 
the origin. 


(f) T projects points onto the x-axis and then shears 
them vertically by a factor of 2. [A]-354 


(g) T shears horizontally by a factor of 0.76 and then 
dilates points horizontally by a factor of a. [A]- 
354 


4. Are S and T the same transformation? 


(a) S rotates points (counterclockwise) about the ori- 
gin by 45°, reflects them about the x-axis, and then 
rotates them clockwise about the origin by 45°. 

T reflects points about the line y = —x. 


(b) S reflects points about the y-axis and then rotates 
them 7 radians clockwise about the origin. 
T rotates points x radians clockwise about the ori- 
gin and then reflects them over the x-axis. [S]-309 


0 
1 


(c) 


(d) 


S shears points vertically by a factor of 2 and then 
scales them both horizontally and vertically by a 
factor of i. 
T scales points by horizontally and vertically by 
a factor of 5 and then shears them vertically by a 
factor of 2. 


S reflects points about the x-axis and then reflects 
them about the y-axis. 
T rotates points 180° about the origin. [A]-354 


5. Find the standard matrix for the linear transformation 


T : RR" > R”" such that 


0 
0 
-11 
war >| 45 | 
1 
-7.5 
1 1 0 
lo) = | -no]m rr) - 
-3.7 
—5.2 
10.1 
_43 . [S]-309 
4.3 
17 -87 
1 0 
—94 67 
ol] fella] Sf 
-30 143 
—45 
0 
-129 
| || a 
—33 


ors) [Bl 0D - 


132 
98 
-77 


. [A]-354 
1 0)
0 -7 1 -14 
@ T|| ° Sle hell - [4 

0 0 
0 0 
0 -4 0 -15 

i 1 -| Tl jana r 0 -| “6 | 
0 1 


6. Find a sequence of elementary and/or projection matrices
whose product is the standard matrix of the linear
transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$.


(a) T reflects points about the y-axis and then projects 
them onto the x-axis. 


(b) T dilates points vertically by a factor of u and then 
shears them horizontally by a factor of 1.62. [A]- 
354 


(c) T projects points onto the y-axis and then dilates 
them horizontally and vertically by a factor of 2. 


(d) T reflects points about the x-axis and then shears 
them horizontally by a factor of 1.27. [A]-354 


(e) T shears points horizontally such that [ 0 1 l' 


maps to [ 0.08 1 |’ and then reflects them 
about the x-axis. 


(f) T contracts points horizontally and vertically by 
a factor of 3 and then reflects them about the y- 


axis. [A]-354 


7. Though it is not strictly necessary, you may find Sage- 
Math helpful. Find the standard matrix for the linear 
transformation T : R” — R” such that 


41 


«) ORIED | =| - [3] 
2 ~ 
-6 -5 
| 1 = ea and T}} -l | = 
0 1 
ee } [S]-309 
-8 
») OHEEED (| ',)) - | 5 | a 
6 
2 
7 -5 
r([ 20 =| 3 
-4 
-46 a 
«) ORIED | | | 5 pol 
-95 7 
8 


12) Je 


. [A]-354 


» OLED ,( || 


3 
4 -9 
: 
(e) © T 9 | 1 | 
2 
-6 5 
14 5 -5 6 
PY ost -| She 16 -| Sh 
-5 3 
1 
-3 9 
and T 3 -| 2 } [A]-354 
1 
8. Argue that the transformation S RP > R, 


S ([ x y I) [ ry l' is not linear by the fol- 
lowing steps. 
(a) Find the images of /., and J. under S. 


(b) Form the matrix M whose columns are the vectors 
from part 8a. 


(c) Find a vector v such that S(v) # Mv. 


. Argue that the transformation S : R? > R’,S : | 
2 3 
-4 1 


steps. 
(a) Find the images of /.,, and /.2 under S. 


x 


. | + | = | is not linear by the following 


(b) Form the matrix M whose columns are the vectors 
from part 9a. 


(c) Find a vector v such that S(v) # Mv. 


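10. Show that reflection about the origin of the plane, which
maps $\begin{bmatrix} x \\ y \end{bmatrix}$ to $\begin{bmatrix} -x \\ -y \end{bmatrix}$, is linear by finding a matrix $M$
such that $M\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ -y \end{bmatrix}$. [A]-354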


11. Show that F,, : R? > R2, Fy, }-| 2 | etc 


tion about the origin of the plane) is linear by showing 
that F,,(u + v) = F,,(u) + F(v) and F,,(cu) = cF,,(u) 
for any vectors u and v and scalar c. 


Rotation. The remaining exercises explore rotation as a com- 
position of elementary matrices and as a linear transformation 
itself. Each question after number 12 depends on the results of 
12. 
12. Argue that rotation in the plane is linear and write it alge- (c) T rotates points clockwise about the origin by % 
braically as matrix multiplication by the following steps. radians and then reflects about the line y = —x. 
Let Ry denote rotation through angle @ counterclockwise 
about the origin. [S]-310 15. Find a sequence of elementary matrices whose product 
is the matrix derived in exercise 12d. Hint: try a product 


(a) Argue geometrically that Ry is a linear transfor- 
mation. Your argument need not be a proof, just 
enough reason to believe Ry is (likely) linear. c 0 | 1 0 | 1 0 | 1 ¢ | 


(b) Find the images of /.; and J.. under Ry. 0 1 sol On 0 1 


of the form 


c) Write Rg as a matrix product using the results of 
p g ; ' ' 
12b 16. Reflection about lines other than the axes are linear trans- 


formations too and can be realized by a composition of 


(d) Demonstrate algebraically that the matrix derived rotations and reflection. To demonstrate, derive the stan- 


in part 12c affects rotation through angle @ about 3 
dard matrix for reflection about the line y = aq ac- 


the origi bit tor| > |. This sh 
e origin on an arbitrary vector . This shows : : 
§ y y cording to the following steps. 


that rotation is properly represented by matrix mul- 
tiplication, making it a linear transformation by (a) Pick a point (any point) P not on the line. 


theorem 10. (b) Calculate the sine of the angle the line makes with 


13. (i) Find the standard matrix of T : R? > R?; and (ii) find eeemrergy (ia (=)) 
r : : 
the image of | 1 1 | : A 


(a) T rotates points counterclockwise about the origin (c) Calculate the cosine of the angle the line makes 


é 36 
by a radians. [S]-312 with the x-axis, cos (ian (=) 
(b) T rotates points clockwise about the origin by 7 
radians. (d) Rotate point P counterclockwise about the origin 


by angle tan7! 38 . HINT: the answers to parts 


(c) T rotates points counterclockwise about the origin iicund Se ate usctull here: 


by 180 degrees. [A]-354 


id. Find the standard matixof T+? = RZ. (e) Reflect the point from part 16d about the x-axis. 


(f) Rotate the point from part 16e clockwise about the 
origin by angle tan"! (38). 


(a) T rotates points (counterclockwise) about the ori- 
gin by 3 radians and then reflects them about the 


y-axis. The product of the matrices (that you hopefully used) in 
(b) T rotates points (counterclockwise) about the ori- parts 16d, 16e, and 16f is the standard matrix. Graph the 

gin by x radians and then projects them onto the line and point P and the point from part 16f on the same 

y-axis. [A]-354 set of axes to see that you have calculated correctly. 


Answers 
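reflection about x-axis: Using $F_x$ as the name for reflection about the x-axis, note that

\[
F_x\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right)
= F_x\left(\begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}\right)
= \begin{bmatrix} u_1 + v_1 \\ -(u_2 + v_2) \end{bmatrix}
= \begin{bmatrix} u_1 + v_1 \\ -u_2 - v_2 \end{bmatrix}
= \begin{bmatrix} u_1 \\ -u_2 \end{bmatrix} + \begin{bmatrix} v_1 \\ -v_2 \end{bmatrix}
= F_x\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right) + F_x\left(\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right)
\]

and

\[
F_x\left(c\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right)
= F_x\left(\begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}\right)
= \begin{bmatrix} cu_1 \\ -cu_2 \end{bmatrix}
= c\begin{bmatrix} u_1 \\ -u_2 \end{bmatrix}
= cF_x\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right).
\]

This shows that $F_x(\mathbf{u} + \mathbf{v}) = F_x(\mathbf{u}) + F_x(\mathbf{v})$ and $F_x(c\mathbf{u}) = cF_x(\mathbf{u})$ for any vectors $\mathbf{u}$ and $\mathbf{v}$ and scalar $c$, so $F_x$ is linear by definition.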


four cases:
  • rows are multiples of one another: if $a$ and $b$ are nonzero, then $N_{1,\cdot} = \frac{a}{b}N_{2,\cdot}$ and $\frac{b}{a}N_{1,\cdot} = N_{2,\cdot}$
  • one of the rows contains two zeros: $a = 0$ or $b = 0$ while $k$ and $\ell$ are arbitrary
  • columns are multiples of one another: if $k$ and $\ell$ are nonzero, then $N_{\cdot,1} = \frac{k}{\ell}N_{\cdot,2}$ and $\frac{\ell}{k}N_{\cdot,1} = N_{\cdot,2}$
  • one of the columns contains two zeros: $k = 0$ or $\ell = 0$ while $a$ and $b$ are arbitrary
4.5 Isomorphisms


The word vector has been used in a number of different ways in this book. In section 1.3 the word vector was said to have the understood meaning from physics or calculus and represented using the angled bracket notation $\langle x, y \rangle$. $n \times 1$ matrices were called column vectors, or just vectors, and were said to represent vectors despite being different objects. The calculus/physics idea of a vector was brought to life geometrically in section 1.4 when a vector was represented by an arrow with both magnitude and direction. In section 4.2 it was noted that vectors in a vector space have unique representations as linear combinations of basis vectors. Most recently, vectors (with tails at the origin, in section 4.4) were represented by points. In all instances, these were representations of vectors, not vectors outright. Only in section 4.1, where the word vector was used to refer to any element of a vector space, did we have a definition. To be clear, this is the one and only definition of vector. All other uses will have to be justified from within this umbrella.

By definition, $\mathbb{R}^n$ is the set of all ordered lists of $n$ real numbers. It is not, on the surface, a vector space at all. Elements of $\mathbb{R}^n$ are therefore not inherently vectors! It is only once addition and scalar multiplication are defined (and adhere to the ten properties outlined in section 4.1) that $\mathbb{R}^n$ becomes a vector space. When nothing is said to the contrary, addition and scalar multiplication in $\mathbb{R}^n = \{r_1, r_2, \ldots, r_n : r_i \in \mathbb{R}, i = 1, 2, \ldots, n\}$ are understood to be defined element-wise. That is, for any elements $r_1, r_2, \ldots, r_n \in \mathbb{R}^n$ and $s_1, s_2, \ldots, s_n \in \mathbb{R}^n$

\[
r_1, r_2, \ldots, r_n + s_1, s_2, \ldots, s_n = r_1 + s_1, r_2 + s_2, \ldots, r_n + s_n
\]

and for any element $r_1, r_2, \ldots, r_n \in \mathbb{R}^n$ and scalar $c \in \mathbb{R}$,

\[
c \times r_1, r_2, \ldots, r_n = cr_1, cr_2, \ldots, cr_n.
\]


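These definitions should remind you of the definitions of matrix addition and scalar multiplication, which are defined entry-wise. For $n \times 1$ matrices, addition and scalar multiplication are defined as follows. For any elements $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \in M_{n \times 1}(\mathbb{R})$ and $\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix} \in M_{n \times 1}(\mathbb{R})$

\[
\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} + \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix} = \begin{bmatrix} r_1 + s_1 \\ r_2 + s_2 \\ \vdots \\ r_n + s_n \end{bmatrix}
\]

and for any element $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \in M_{n \times 1}(\mathbb{R})$ and scalar $c \in \mathbb{R}$,

\[
c\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} = \begin{bmatrix} cr_1 \\ cr_2 \\ \vdots \\ cr_n \end{bmatrix}.
\]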


So what is the difference between elements of $\mathbb{R}^n$ and elements of $M_{n \times 1}(\mathbb{R})$? Functionally there is no difference! There is no way to distinguish elements of $\mathbb{R}^n$ and elements of $M_{n \times 1}(\mathbb{R})$ based purely on their properties. Each ordered list of real numbers $r_1, r_2, \ldots, r_n$ could just as easily be written as a column matrix

\[
\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}
\]

and vice versa. The sum of two ordered lists of real numbers could just as easily be written as a sum of two column matrices and vice versa. Each scalar multiple of an ordered list of real numbers could just as easily be written as a scalar multiple of a column matrix and vice versa. When two sets are interchangeable in form and function, we say they are isomorphic.

Figure 4.5.1: Isomorphism Path
[Figure: sets of representations, including ordered lists $r_1, r_2, \ldots, r_n$, column matrices, row matrices $\begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}$, points $(r_1, r_2, \ldots, r_n)$, and linear combinations $r_1 I_{\cdot,1} + r_2 I_{\cdot,2} + \cdots + r_n I_{\cdot,n}$, joined by arrows labeled with isomorphisms. NOTE: $r_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$.]
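
The interchangeability of ordered lists and column matrices can be made concrete. The sketch below assumes ordered lists are stored as Python tuples and column matrices as lists of one-entry rows; all function names are illustrative. It checks that converting and then adding gives the same result as adding and then converting, and likewise for scalar multiplication.

```python
def add_lists(r, s):
    """Element-wise addition of two ordered lists."""
    return tuple(x + y for x, y in zip(r, s))


def scale_list(c, r):
    """Element-wise scalar multiplication of an ordered list."""
    return tuple(c * x for x in r)


def add_columns(R, S):
    """Entry-wise addition of two n x 1 matrices."""
    return [[x[0] + y[0]] for x, y in zip(R, S)]


def scale_column(c, R):
    """Entry-wise scalar multiplication of an n x 1 matrix."""
    return [[c * x[0]] for x in R]


def to_column(r):
    """Rewrite an ordered list as an n x 1 matrix."""
    return [[x] for x in r]


def to_list(R):
    """Rewrite an n x 1 matrix as an ordered list."""
    return tuple(row[0] for row in R)


r, s, c = (1, 2, 3), (4, 5, 6), 10

print(to_column(add_lists(r, s)) == add_columns(to_column(r), to_column(s)))
print(to_column(scale_list(c, r)) == scale_column(c, to_column(r)))
print(to_list(to_column(r)) == r)
```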

Formally, two sets are isomorphic if there exists an isomorphism between them. What defines an isomorphism 
depends on the structure of the sets. A vector space is a set endowed with two operations. The set defines the 
elements of the vector space and the operations define the structure. An isomorphism between vector spaces maps 
each element of one vector space to exactly one element of the other without missing any and preserves vector addition 
and scalar multiplication. Such an isomorphism can be understood as the mathematical formalization allowing the 
free flow between one representation of a vector and another. It supplies the rigor behind using row vectors, column 
vectors, ordered lists, arrows, points, linear combinations, and vectors in the sense of calculus or physics as if they 
were all the same thing. For example, the map $T : \mathbb{R}^n \to M_{n \times 1}(\mathbb{R})$,

\[
T(r_1, r_2, \ldots, r_n) = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}
\]

is an isomorphism. Can you verify this claim? Answer on page 150. To complete the formalism, the following definitions are introduced. A map $T : A \to B$ is called


1. onto if for each b € B the equation T(a) = b has at least one solution a € A. 
2. one-to-one if for each b € B the equation T(a) = b has at most one solution a € A. 


If $A$ and $B$ are vector spaces, then $T$ is an isomorphism if it is one-to-one, onto, and linear. Being one-to-one and onto assures that $T$ “maps each element of one vector space to exactly one element of the other without missing any” and being linear assures it “preserves vector addition and scalar multiplication”.
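
The two definitions can be tried out on small finite sets, where every equation $T(a) = b$ can be checked by counting solutions. The sketch below assumes a map is stored as a Python dictionary from domain elements to codomain elements; the two sample maps are hypothetical. Finite sets like these are not vector spaces, so linearity is not part of this check.

```python
def solution_count(mapping, b):
    """Count the solutions a of the equation T(a) = b."""
    return sum(1 for a in mapping if mapping[a] == b)


def is_onto(mapping, codomain):
    """Every b in the codomain has at least one solution."""
    return all(solution_count(mapping, b) >= 1 for b in codomain)


def is_one_to_one(mapping, codomain):
    """Every b in the codomain has at most one solution."""
    return all(solution_count(mapping, b) <= 1 for b in codomain)


# Hypothetical sample maps.
T1 = {1: "a", 2: "b", 3: "a"}   # onto {"a", "b"}, but "a" is hit twice
T2 = {1: "a", 2: "b"}           # one-to-one, but misses "c"

print(is_onto(T1, ["a", "b"]), is_one_to_one(T1, ["a", "b"]))
print(is_onto(T2, ["a", "b", "c"]), is_one_to_one(T2, ["a", "b", "c"]))
```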

When we use the various representations of elements of $\mathbb{R}^n$ interchangeably, we are relying on the existence of an isomorphism from each one to each other. Much like showing that the statements in a list are equivalent by showing a path of implications from any statement to any other, this can be done by showing a path of isomorphisms from any vector space to any other. This is because the composition of isomorphisms is an isomorphism. Can you justify this claim? Answer on page 151. See figure 4.5.1. Once isomorphisms $f, g, h, k, \ell, m, p, q$ between the sets are demonstrated to exist, each vector space is isomorphic to each other by composition. For example, $g : M_{n \times 1}(\mathbb{R}) \to M_{1 \times n}(\mathbb{R})$ defined by

\[
g\left(\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}\right) = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}
\]

is an isomorphism. Can you justify this? Answer on page 151.

Maybe more surprising is the claim that all n-dimensional vector spaces over the real numbers (those defined for real number scalars) are isomorphic. In different words, we might say that up to isomorphism, there is only one n-dimensional vector space over the real numbers. Different n-dimensional vector spaces may look different and contain different objects, but they all have the same structure and therefore are interchangeable. This claim can be proven by leaning on the fact that an n-dimensional vector space has a basis with n elements.

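Let $V$ and $W$ be n-dimensional vector spaces. By definition, each has a basis with $n$ elements. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ and $C = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ be bases for $V$ and $W$, respectively, and define

\[
\begin{aligned}
f &: V \to \mathbb{R}^n, & f(\mathbf{v}) &= r_1, r_2, \ldots, r_n \ \text{ where } \mathbf{v} = r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n \\
g &: W \to \mathbb{R}^n, & g(\mathbf{w}) &= s_1, s_2, \ldots, s_n \ \text{ where } \mathbf{w} = s_1\mathbf{w}_1 + s_2\mathbf{w}_2 + \cdots + s_n\mathbf{w}_n
\end{aligned}
\]

Since the expression of an element of a vector space as a linear combination of basis vectors is unique, $f$ and $g$ are well-defined (they are actually functions, not simply relations). Given an arbitrary element $\mathbf{r} = r_1, r_2, \ldots, r_n$ of $\mathbb{R}^n$, let $\mathbf{v} = r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n$. Since vector spaces are closed under linear combinations, $\mathbf{v}$ is in $V$. Furthermore, $f(\mathbf{v}) = r_1, r_2, \ldots, r_n$ so $f$ is onto. Now suppose there is a second element $\mathbf{u}$ in $V$ such that $f(\mathbf{u}) = r_1, r_2, \ldots, r_n$. By definition of $f$ it must be that $\mathbf{u} = r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n$, so $\mathbf{u} = \mathbf{v}$ and $f$ is one-to-one. Now let $\mathbf{u}$ and $\mathbf{v}$ be arbitrary elements of $V$, and write $\mathbf{u} = r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n$ and $\mathbf{v} = s_1\mathbf{v}_1 + s_2\mathbf{v}_2 + \cdots + s_n\mathbf{v}_n$. Letting $c$ be an arbitrary scalar,

\[
\begin{aligned}
f(\mathbf{u} + c\mathbf{v}) &= f(r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n + c(s_1\mathbf{v}_1 + s_2\mathbf{v}_2 + \cdots + s_n\mathbf{v}_n)) \\
&= f(r_1\mathbf{v}_1 + r_2\mathbf{v}_2 + \cdots + r_n\mathbf{v}_n + cs_1\mathbf{v}_1 + cs_2\mathbf{v}_2 + \cdots + cs_n\mathbf{v}_n) \\
&= f((r_1 + cs_1)\mathbf{v}_1 + (r_2 + cs_2)\mathbf{v}_2 + \cdots + (r_n + cs_n)\mathbf{v}_n) \\
&= r_1 + cs_1, r_2 + cs_2, \ldots, r_n + cs_n \\
&= r_1, r_2, \ldots, r_n + cs_1, cs_2, \ldots, cs_n \\
&= r_1, r_2, \ldots, r_n + c \times s_1, s_2, \ldots, s_n \\
&= f(\mathbf{u}) + cf(\mathbf{v})
\end{aligned}
\]

so $f$ is linear. Because $f$ is one-to-one, onto, and linear, $f$ is an isomorphism. By similar argument, $g$ is also an isomorphism. By exercise 13e, $g^{-1}$ is an isomorphism. Since the composition of isomorphisms is an isomorphism, $g^{-1} \circ f : V \to W$ is an isomorphism.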
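
The coordinate map $f$ can be computed explicitly in a small case. The sketch below assumes $V = \mathbb{R}^2$ with the hypothetical basis $\{(1, 1), (1, -1)\}$ and finds coordinates by Cramer's rule with exact fractions. It then checks the identity $f(\mathbf{u} + c\mathbf{v}) = f(\mathbf{u}) + cf(\mathbf{v})$ on sample vectors.

```python
from fractions import Fraction

# Hypothetical basis of R^2 chosen for illustration.
B1, B2 = (1, 1), (1, -1)


def coords(v):
    """Return (r1, r2) with v = r1*B1 + r2*B2, using Cramer's rule."""
    det = B1[0] * B2[1] - B2[0] * B1[1]
    r1 = Fraction(v[0] * B2[1] - B2[0] * v[1], det)
    r2 = Fraction(B1[0] * v[1] - v[0] * B1[1], det)
    return (r1, r2)


def add(r, s):
    """Element-wise addition of two ordered lists."""
    return tuple(x + y for x, y in zip(r, s))


def scale(c, r):
    """Element-wise scalar multiplication of an ordered list."""
    return tuple(c * x for x in r)


u, v, c = (3, 1), (0, 4), 5

left = coords(add(u, scale(c, v)))            # f(u + cv)
right = add(coords(u), scale(c, coords(v)))   # f(u) + c f(v)

print(coords(u), coords(v), left, right)
```

A check on a few vectors is only evidence, not a proof; the calculation in the text is what establishes linearity for every choice of vectors and scalar.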

Key Concepts 

onto: a map $T : A \to B$ such that for each $b \in B$ the equation $T(a) = b$ has at least one solution $a \in A$.

one-to-one: a map $T : A \to B$ such that for each $b \in B$ the equation $T(a) = b$ has at most one solution $a \in A$.

isomorphism: a one-to-one, onto, linear transformation between vector spaces.

isomorphic: vector spaces between which there exists an isomorphism.

composition of isomorphisms: the composition of isomorphisms is an isomorphism.


Exercises (c) p: R= [0,00), p(x) = 2 
1. Which functions are one-to-one? [S]-312 (d) h: [0,00) > R, A(x) = yx 
(a) f: ROR, f(@) = sin(x) (©) g: ROR, g() =3x-9 


(b) g:R— (0,0), g(x) = e* 2. Which functions are onto? [S]-312 


5. Which of the requirements of isomorphism does T : 


BR > P(R),T([ 1m 73 |) = @—n)G—1)(x-15) 
fail to satisfy? 


(a) T is one-to-one 


<8 


(d) T 


N 


13, -15 -9 
0 -3 10 


(e) T 
(a) f: ROR, f(x) =-28 P 3x — 4y 
T =| -x+3 
b) q@:RIORgx=eH1 . (|; kay 
(c) p: [0,00) > R, p(x) = x . 
(d) g: [0, 4] — R, g(x) = cos(x) (b) | , | _ 2x — 3y+7z 
- -2 
(e) A: [0,00) —> [0, 00), h(x) = Vz z a a 
3. Which of the functions in exercise | are onto? 4 14 -l ' 
4. Which of the functions in exercise 2 are one-to-one? oe ( y - _ fe | y | [S}-313 


(b) T is onto 


(c) T is linear 


-5 «(ll x 
sD-[t Ife] = 


ody -[as a |] 


< om 


6. Which of the requirements of isomorphism does T : 
RY > RY, T (51, 52, 53, 54, 55,---)) = 51, 82, 83, 84 fail to : : e rene? 
satisfy? [S]-313 11. For what type of matrix M is Ty : R” > R” an isomor- 
phism? 
a). ‘ veers 12. It is claimed that the vector spaces in figure 4.5.1 are all 
(b) T is onto isomorphic, and a formula for isomorphism f is provided 
(c) T is linear in the text. Provide a formula for the isomorphism 
7. Which of the requirements of isomorphism does T : (a) g 
C > M)x2(R), T (a + bi) = ; | fail to satisfy? (b) A [A]-354 
. (c) m 
(a) T is one-to-one 
d A]-354 
(b) T is onto Ce a 
(c) T is linear (©) 4 
8. Is the transformation det : M2,2(R) — R 13. Justify the claim. 
(a) linear? (a) the statement “Ty : R” — R” is one-to-one” may 
(b) : ‘ be added to the list of equivalent statements of the- 
ogee orem 5. [A]-354 
to? 
(c) = ; (b) the statement “Ty : R’ — R” is onto” may be 
(d) an isomorphism? added to the list of equivalent statements of theo- 
9. Is the transformation T : C({0, 1]) > C({0,1]), T(f) = rem 6. 
e*f(x) [S]-313 (c) the statements “Ty : R” — R” is one-to-one” and 
(a) linear? “Ty : R — R" is onto” may be added to the list 
of equivalent statements of theorem 7. [A]-354 
(b) one-to-one? 
(d) If f : V — W is an isomorphism between vector 
(ont spaces V and W, then f is invertible. 
i ism? 
ea) ah SOD (e) If f : V — W is an isomorphism from vector 
10. Is the transformation T : R” — R"” (i) one-to-one? (ii) space V to vector space W, then f~! is an isomor- 
onto? (iii) an isomorphism? phism. [A]-354 
Answers 


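first isomorphism: An isomorphism between vector spaces maps each element of one vector space to exactly one element of the other without missing any and preserves vector addition and scalar multiplication. The map $T : \mathbb{R}^n \to M_{n \times 1}(\mathbb{R})$,

\[
T(r_1, r_2, \ldots, r_n) = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}
\]

does just that because

1. the arbitrary element $r_1, r_2, \ldots, r_n \in \mathbb{R}^n$ maps via $T$ to (the specific element) $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \in M_{n \times 1}(\mathbb{R})$ and only $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}$, so $T$ maps each element of $\mathbb{R}^n$ to exactly one element of $M_{n \times 1}(\mathbb{R})$.

2. (the specific element) $r_1, r_2, \ldots, r_n \in \mathbb{R}^n$ maps via $T$ to the arbitrary element $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \in M_{n \times 1}(\mathbb{R})$, so $T$ does not miss any elements of $M_{n \times 1}(\mathbb{R})$.

3. $T(r_1, r_2, \ldots, r_n + s_1, s_2, \ldots, s_n) = T(r_1 + s_1, r_2 + s_2, \ldots, r_n + s_n) = \begin{bmatrix} r_1 + s_1 \\ r_2 + s_2 \\ \vdots \\ r_n + s_n \end{bmatrix} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} + \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix} = T(r_1, r_2, \ldots, r_n) + T(s_1, s_2, \ldots, s_n)$, so addition is preserved under $T$.

4. $T(c \times r_1, r_2, \ldots, r_n) = T(cr_1, cr_2, \ldots, cr_n) = \begin{bmatrix} cr_1 \\ cr_2 \\ \vdots \\ cr_n \end{bmatrix} = c\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} = cT(r_1, r_2, \ldots, r_n)$, so scalar multiplication is preserved under $T$.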


composition of isomorphisms: Let $A, B, C$ be vector spaces and $T : A \to B$ and $S : B \to C$ be isomorphisms. We need to show that $S \circ T : A \to C$ is an isomorphism.


1. Let $c$ be an element of $C$. Then because $S$ is onto, there is at least one $b \in B$ such that $S(b) = c$. Let $b$ be such a solution. Because $T$ is onto, there is at least one $a \in A$ such that $T(a) = b$. Let $a$ be such a solution. Then $(S \circ T)(a) = S(T(a)) = S(b) = c$, so $(S \circ T)(a) = c$ has at least one solution and $S \circ T$ is onto. Generally, this shows that the composition of onto mappings is an onto mapping.

2. Suppose $(S \circ T)(a_1) = c$ and $(S \circ T)(a_2) = c$. Equivalently $S(T(a_1)) = c$ and $S(T(a_2)) = c$. But $S$ is one-to-one, so the equation $S(b) = c$ has at most one solution. Therefore, $T(a_1) = T(a_2) = b$ for the same $b \in B$. Since $T$ is one-to-one, the equation $T(a) = b$ has at most one solution. Therefore $a_1 = a_2$, which shows that for each $c \in C$, the equation $(S \circ T)(a) = c$ has at most one solution and $S \circ T$ is one-to-one. Generally, this shows that the composition of one-to-one mappings is a one-to-one mapping.

3. In exercise 24c of section 4.3 you are asked to show that the composition of linear transformations is linear. This completes the proof.


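isomorphism g Let $g : M_{n\times 1}(\mathbb{R}) \to M_{1\times n}(\mathbb{R})$ be defined by

$$g\left(\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}\right) = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}.$$

Then

1. Given $\begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}$ in $M_{1\times n}(\mathbb{R})$,

$$g\left(\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}\right) = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}$$

so $g$ is onto.

2. Given $\mathbf{r} = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix}$ in $M_{1\times n}(\mathbb{R})$, suppose $g(\mathbf{u}) = \mathbf{r}$ and $g(\mathbf{v}) = \mathbf{r}$. Then

$$\mathbf{u} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} = \mathbf{v}$$

so $g$ is one-to-one.

3. Let $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$ and $\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}^T$ be in $M_{n\times 1}(\mathbb{R})$ and $c$ be a scalar. Using the
result of exercise 24d of section 4.3, the following calculation shows the linearity of $g$.

$$\begin{aligned}
g(\mathbf{x} + c\mathbf{y}) &= g\left(\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + c\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\right) \\
&= g\left(\begin{bmatrix} x_1 + cy_1 \\ x_2 + cy_2 \\ \vdots \\ x_n + cy_n \end{bmatrix}\right) \\
&= \begin{bmatrix} x_1 + cy_1 & x_2 + cy_2 & \cdots & x_n + cy_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} + c\begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \\
&= g(\mathbf{x}) + c\,g(\mathbf{y})
\end{aligned}$$


4.6 Inner Product Spaces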


Section 4.5 ended by showing that all n-dimensional vector spaces are isomorphic. That means, for example, $\mathbb{R}^6$,
$P_5(\mathbb{R})$, $M_{2\times 3}(\mathbb{R})$, and the vector space generated by $G = \{\sin t, \cos t, \sin 2t, \cos 2t, \sin 3t, \cos 3t\}$ are all isomorphic.
The essential ingredient of elements of each space is an ordered list of six real numbers. Elements of any of these 
vector spaces can be represented by such a list. Addition and scalar multiplication in one can be accomplished just 
as well in another. Representation, addition, and scalar multiplication are not enough to distinguish one from the 
other. Yet it might be nice to be able to draw distinctions between them. After all, they are different types of objects. 
Polynomials and elements of G are functions. A list of real numbers is just a list. A matrix is yet a different type of 
object. 

One important feature of $\mathbb{R}^n$ not included in the definition of a vector space is the dot product. We have seen that
the dot product allows us to define the magnitudes and orthogonality of elements of $\mathbb{R}^n$. If we could extend the idea
of the dot product to any vector space in such a way that it may not be preserved under one-to-one and onto maps, it
might prove a useful distinguishing feature. To do that, though, we would need to write down the essential features of
the dot product and hope that, as an abstract list of requirements, other vector spaces would admit similar operators.
We explore some properties of the dot product presently.

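First, representing elements of $\mathbb{R}^n$ by column vectors, the dot product of $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ is defined by $\mathbf{u}^T\mathbf{v}$ (see section
1.3). Letting $\mathbf{u} = u_1, u_2, \ldots, u_n$ and $\mathbf{v} = v_1, v_2, \ldots, v_n$, the dot product of $\mathbf{u}$ and $\mathbf{v}$, which we will begin to denote $\mathbf{u} \cdot \mathbf{v}$,
is given by

$$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n. \tag{4.6.1}$$


This is not a new definition, just a new notation and a general formula for computing the dot product. Also in section
1.3 the magnitude of a vector $\mathbf{u}$ is defined by $\sqrt{\mathbf{u}^T\mathbf{u}}$. Using the new notation and formula (4.6.1),

$$\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}} = \sqrt{u_1u_1 + u_2u_2 + \cdots + u_nu_n} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2},$$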


a formula that only makes sense as a magnitude since $\mathbf{u} \cdot \mathbf{u}$ is nonnegative (the squares of real numbers are nonnegative
and the sum of nonnegative numbers is nonnegative). If $\mathbf{u} \cdot \mathbf{u}$ were sometimes negative it would not make a good
quantity for defining vector magnitude. As such, the nonnegativity of $\mathbf{u} \cdot \mathbf{u}$ is an important property of the dot product.

Second, in section 1.4 it was shown that $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$, meaning that the dot product is commutative. Commutativity
(or lack thereof) is a fundamental property of any operator. Since the dot product is commutative any abstraction of
the dot product should be commutative too. Note that both nonnegativity (of $\mathbf{u} \cdot \mathbf{u}$) and commutativity (of $\mathbf{u} \cdot \mathbf{v}$) follow
directly from properties of the real numbers.
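
For readers who like to experiment, here is a minimal Python sketch of formula (4.6.1) and the magnitude formula. The function names `dot` and `magnitude` and the sample vectors are illustrative choices, not notation from the text.

```python
import math

def dot(u, v):
    """Dot product of two equal-length lists of real numbers, as in formula (4.6.1)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of entries")
    return sum(ui * vi for ui, vi in zip(u, v))

def magnitude(u):
    """Magnitude sqrt(u . u); the square root is real because u . u is nonnegative."""
    return math.sqrt(dot(u, u))

u = [3, -4, 12]   # sample vectors chosen only for illustration
v = [1, 2, 2]
print(dot(u, v))      # 3*1 + (-4)*2 + 12*2 = 19
print(dot(v, u))      # 19 again, illustrating commutativity
print(magnitude(u))   # sqrt(9 + 16 + 144) = 13.0
```

Swapping the arguments of `dot` only reorders products of real numbers, which is exactly why commutativity follows from properties of the real numbers.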

Two other properties, distributivity and a type of associativity, of the dot product follow directly from properties
of the real numbers as well. You may have explored these properties in exercises 4 and 8 of section 1.3. To recap
using the new notation,

$$(\mathbf{u} + \mathbf{w}) \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{v}$$

and

$$(c\mathbf{u}) \cdot \mathbf{v} = c(\mathbf{u} \cdot \mathbf{v}).$$

Can you show these identities are true using properties of real numbers? Answer on page 157. You might think of
these properties as a kind of linearity. Since they follow as a natural consequence of using real number scalars, any
abstraction of the dot product should have similar distributive and associative properties.

Fifth, a property that cannot be proven from the previous four but is nonetheless a direct consequence of properties
of the real numbers is that

$$\mathbf{u} \cdot \mathbf{u} = 0 \text{ if and only if } \mathbf{u} = \mathbf{0}.$$

That is, the statements “$\mathbf{u} \cdot \mathbf{u} = 0$” and “$\mathbf{u} = \mathbf{0}$” are equivalent. The dot product can be used to determine whether
a vector is the zero vector. Can you argue using properties of real numbers that the equivalence is true? Answer on
page 157.

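These five features of the dot product form the foundation for a useful abstraction. We define an inner product
on a real vector space $V$ to be any operator $\langle\,,\rangle : V \times V \to \mathbb{R}$ such that


1. $\langle \mathbf{u}, \mathbf{u} \rangle \geq 0$ for all $\mathbf{u}$ in $V$ (nonnegativity)

2. $\langle \mathbf{u}, \mathbf{u} \rangle = 0$ if and only if $\mathbf{u} = \mathbf{0}$ (zero vector identity)

3. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$ for all $\mathbf{u}, \mathbf{v}$ in $V$ (commutativity)

4. $\langle \mathbf{u} + \mathbf{w}, \mathbf{v} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{w}, \mathbf{v} \rangle$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w}$ in $V$ (preservation of addition)

5. $\langle c\mathbf{u}, \mathbf{v} \rangle = c\langle \mathbf{u}, \mathbf{v} \rangle$ for all $\mathbf{u}, \mathbf{v}$ in $V$ and all scalars $c$ (preservation of scalar multiplication)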


The dot product on $\mathbb{R}^n$ as defined in section 1.3 is the canonical, and motivating, example of an inner product. Any
real vector space on which an inner product is defined is called an inner product space.
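
The five requirements can be spot-checked numerically before attempting a proof. The sketch below is only that: the helper `spot_check` and the candidate operator `weighted` are hypothetical names introduced here, and passing the check is evidence, never a proof.

```python
import random

def weighted(u, v):
    """Hypothetical candidate on R^2: 2*u1*v1 + 3*u2*v2."""
    return 2 * u[0] * v[0] + 3 * u[1] * v[1]

def spot_check(ip, dim, trials=200, seed=0):
    """Try the five inner product properties on sample integer vectors.

    Returns the name of the first property seen to fail, or None if no
    failure is found among the samples.
    """
    rng = random.Random(seed)

    def vec():
        return [rng.randint(-9, 9) for _ in range(dim)]

    zero = [0] * dim
    if ip(zero, zero) != 0:
        return "zero vector identity"
    # Standard basis vectors first (deterministic), then random vectors.
    basis = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
    for u in basis + [vec() for _ in range(trials)]:
        v, w, c = vec(), vec(), rng.randint(-9, 9)
        if ip(u, u) < 0:
            return "nonnegativity"
        if ip(u, u) == 0 and u != zero:
            return "zero vector identity"
        if ip(u, v) != ip(v, u):
            return "commutativity"
        if ip([a + b for a, b in zip(u, w)], v) != ip(u, v) + ip(w, v):
            return "preservation of addition"
        if ip([c * a for a in u], v) != c * ip(u, v):
            return "preservation of scalar multiplication"
    return None

print(spot_check(weighted, 2))                    # None
print(spot_check(lambda u, v: u[0] * v[0], 2))    # zero vector identity
```

The second candidate ignores the second entry of each vector, so the nonzero vector with entries 0, 1 has "inner product" zero with itself, and property 2 fails.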

Extending the ideas of magnitude, distance, and orthogonality to an arbitrary inner product space is a simple 
matter as the following chart suggests. 


                  | in $\mathbb{R}^n$ | in an n-dimensional inner product space
norm$^a$          | $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$ | $\|\mathbf{u}\| = \sqrt{\langle \mathbf{u}, \mathbf{u} \rangle}$
distance$^b$      | $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$ | $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$
orthogonality$^c$ | $\mathbf{u} \cdot \mathbf{v} = 0$ | $\langle \mathbf{u}, \mathbf{v} \rangle = 0$


$^a$ replacement for the word magnitude in an arbitrary inner product space.

$^b$ see exercise 2 in section 1.4

$^c$ calculation (1.4.1) proceeds identically if each dot product is replaced by an inner product.

We use the words norm instead of magnitude and orthogonal instead of perpendicular because magnitude and perpendicular
have special, visual, geometric meaning in $\mathbb{R}^2$ and $\mathbb{R}^3$ that does not readily transfer to other contexts. Objects
such as matrices and functions, and even vectors in $\mathbb{R}^n$ for $n > 3$, cannot be visualized the same way. What would it
mean for two polynomials to be perpendicular, for example? What would be the geometric magnitude of a matrix?
The familiar geometric notions of magnitude and perpendicularity of vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ have no geometric analogy
in other vector spaces such as $P_2(\mathbb{R})$ and $M_{m\times n}(\mathbb{R})$. No matter. This is really the purpose of the above chart.
Orthogonality (the abstraction of perpendicularity) of two vectors is defined by the requirement that their inner product
be zero, whatever that may look like geometrically. Therein lies the power of mathematical abstraction. A notion
such as perpendicularity, which we can clearly see and grasp in $\mathbb{R}^n$, can be extended to other sets where no analogous
picture can be drawn. Likewise the norm (abstraction of magnitude) of a matrix $M$ is defined by the quantity $\sqrt{\langle M, M \rangle}$
whether it has geometric meaning or not.
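
The chart translates almost line for line into code. In this sketch, `norm`, `distance`, and `orthogonal` each take the inner product as a parameter; the operator `weighted` is a hypothetical second inner product on $\mathbb{R}^2$ used only to show that orthogonality depends on which inner product is in force.

```python
import math

def norm(ip, u):
    """Norm induced by the inner product ip: sqrt(<u, u>)."""
    return math.sqrt(ip(u, u))

def distance(ip, u, v):
    """Distance d(u, v) = ||u - v|| for vectors stored as lists."""
    return norm(ip, [a - b for a, b in zip(u, v)])

def orthogonal(ip, u, v):
    """True when <u, v> is exactly 0 (reliable for integer entries)."""
    return ip(u, v) == 0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted(u, v):
    # Hypothetical inner product on R^2, for illustration only.
    return 2 * u[0] * v[0] + 3 * u[1] * v[1]

u, v = [3, 2], [2, -3]
print(orthogonal(dot, u, v))            # True:  3*2 + 2*(-3) = 0
print(orthogonal(weighted, u, v))       # False: 12 - 18 = -6
print(distance(dot, [1, 1], [4, 5]))    # 5.0
print(norm(weighted, [1, 1]))           # sqrt(5), about 2.236
```

The same pair of vectors is orthogonal under one inner product and not under the other, which is the sense in which an inner product adds structure beyond addition and scalar multiplication.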

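Can you verify that $\langle\,,\rangle : P_2(\mathbb{R}) \times P_2(\mathbb{R}) \to \mathbb{R}$,

$$\langle p_0 + p_1x + p_2x^2,\; q_0 + q_1x + q_2x^2 \rangle = \tfrac{2}{5}p_2q_2 + \tfrac{2}{3}p_0q_2 + \tfrac{2}{3}p_1q_1 + \tfrac{2}{3}p_2q_0 + 2p_0q_0 \tag{4.6.2}$$

is an inner product? If you have taken calculus, this is equivalent to $\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx$. Answer without using
calculus on page 157.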
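
The claimed equivalence with the integral can be checked exactly with rational arithmetic. This is a sketch under the assumption that a polynomial is stored as a coefficient list `[c0, c1, c2]`; the function names and the sample polynomials are illustrative.

```python
from fractions import Fraction as F

def ip_formula(p, q):
    """Formula (4.6.2) for coefficient lists p = [p0, p1, p2], q = [q0, q1, q2]."""
    return (F(2, 5) * p[2] * q[2] + F(2, 3) * p[0] * q[2] + F(2, 3) * p[1] * q[1]
            + F(2, 3) * p[2] * q[0] + 2 * p[0] * q[0])

def ip_integral(p, q):
    """Exact integral of p(x)q(x) over [-1, 1], computed term by term."""
    prod = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum(F(2, k + 1) * c for k, c in enumerate(prod) if k % 2 == 0)

p = [1, -2, 3]   # 1 - 2x + 3x^2
q = [4, 0, -1]   # 4 - x^2
print(ip_formula(p, q), ip_integral(p, q))   # 212/15 212/15
```

Agreement on one pair of polynomials is only a sanity check; the general equivalence follows by expanding the product and integrating each power of $x$ as in the comment.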


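Key Concepts
inner product an operator $\langle\,,\rangle : V \times V \to \mathbb{R}$ on a real vector space $V$ such that
1. $\langle \mathbf{u}, \mathbf{u} \rangle \geq 0$ for all $\mathbf{u}$ in $V$
2. $\langle \mathbf{u}, \mathbf{u} \rangle = 0$ if and only if $\mathbf{u} = \mathbf{0}$
3. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$ for all $\mathbf{u}, \mathbf{v}$ in $V$
4. $\langle \mathbf{u} + \mathbf{w}, \mathbf{v} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{w}, \mathbf{v} \rangle$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w}$ in $V$
5. $\langle c\mathbf{u}, \mathbf{v} \rangle = c\langle \mathbf{u}, \mathbf{v} \rangle$ for all $\mathbf{u}, \mathbf{v}$ in $V$ and all scalars $c$

inner product space a vector space endowed with an inner product.


norm extension of the idea of magnitude in $\mathbb{R}^2$ or $\mathbb{R}^3$ to vectors in any inner product space. The norm of a vector $\mathbf{v}$
is denoted $\|\mathbf{v}\|$ and is calculated as $\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$.


distance extension of the idea of distance in $\mathbb{R}^2$ or $\mathbb{R}^3$ to vectors in any inner product space. The distance between
two vectors $\mathbf{u}$ and $\mathbf{v}$ is denoted $d(\mathbf{u}, \mathbf{v})$ and is calculated as $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$.


orthogonal extension of the idea of perpendicular in $\mathbb{R}^2$ or $\mathbb{R}^3$ to vectors in any inner product space. Two vectors $\mathbf{u}$
and $\mathbf{v}$ are said to be orthogonal if $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.

Exercises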
1. Verify that the operator is an inner product. 


(a) 4): R xR OR, 


u v 
" |, . |) = 200 +300 


[S]-314 
(b) 4): R xR OR, 


(appt pete els sy 


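(c) $\langle\,,\rangle : P_2(\mathbb{R}) \times P_2(\mathbb{R}) \to \mathbb{R}$,

$\langle p, q \rangle = p(0)q(0) + p(1)q(1) + p(2)q(2)$

(d) $\langle\,,\rangle : P_2(\mathbb{R}) \times P_2(\mathbb{R}) \to \mathbb{R}$,

$\langle p, q \rangle = p(0)q(0) + p(1)q(1) + p(2)q(2) + p(3)q(3)$

(e) $\langle\,,\rangle : C([0, 1]) \times C([0, 1]) \to \mathbb{R}$,

$\langle f, g \rangle = \int_0^1 f(x)g(x)\,dx$

(f) $\langle\,,\rangle : C([0, 2\pi]) \times C([0, 2\pi]) \to \mathbb{R}$,

$\langle f, g \rangle = \frac{1}{\pi}\int_0^{2\pi} f(x)g(x)\,dx$

(g) $\langle\,,\rangle : M_{2\times 2}(\mathbb{R}) \times M_{2\times 2}(\mathbb{R}) \to \mathbb{R}$,

$\langle M, N \rangle = (MN^T)_{1,1} + (MN^T)_{2,2}$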
2. Using the inner product of question 1b, calculate 
@ ({-6 -1]'.[2 6J} 
| > 0 I) 


5 I’) [A]-355 


= 
S 
~— 
aula 
—S OO 
lon 
| 
w 
a 
4 
— 
w 


(6 || -1 6 J] tarsss 
3. Using the inner product of question 1d, calculate 
(a) (2+ 5x+6x°,-2-3x- x7) [A]}-355 
(b) (-3 +x2,1—2x4 43°) 
(c) (3 — 4x - 3x?,-24+1- 52°) 
(d) ||2-2x+ "|| [A]-355 
(e) ||-4- all 
(f) ||-5 + 4x - 32>|| 


4. Using the inner product of question 1f, calculate


(a) ce 2- x) 

(b) (x +10, 45) [A1-355 
(c) (cos x, sin x) 

(d) ||cos x|| [A]-355 

(e) |lal| 

(f) {|S + x - x) 


5. Using the inner product of question 1g, calculate 


-9 0 5 6 
(3 38 Spm 


3 o|{7 ol} 


(e) 


9 -9 
ay) 


0 -3 
offs >| 


6. Using the inner product of question 1a, find the distance


between 


(a) [ -12 2 


T 


| and | -13 5 |’ 
(b) | -8 11 | and[ -12 13 | 
() [9 -6 ]' and[ -5 -9 |’ 
[8 -4 J ana[ 14 3 |’ [A}-355 


7. Using the inner product of question 1c, find the distance


between 
(a) 3—3x+4x? and —5 + 6x + 5x? 
(b) 4x — 3x? and —4 — 5x — 6x? 
(c) 14+ 3x—2x? and 6+ 5x—- x? 
(d) 6+ 2x - 6x? and —5 — 3x — 2x? [A]-355 


8. Using the inner product of question 1e, find the distance


between 
(a) x and x? [A]-355 
(b) x and e* 
(c) sinax and cos 2x 
(d) 2x and 5 


9. Using the inner product of question 1g, find the distance


between 

@| 3, y | ana] > | 
-4 

© [|e] 

© | - &, | ana e =| 

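@[ 4 5 m2
10. In $\mathbb{R}^2$ with the inner product of question 1b, are $\mathbf{u}$ and $\mathbf{v}$ orthogonal?
(a) $\mathbf{u} = \begin{bmatrix} 3 & 10 \end{bmatrix}^T$ and $\mathbf{v} = \begin{bmatrix} 10 & -5 \end{bmatrix}^T$ [S]-314
(b) $\mathbf{u} = \begin{bmatrix} -1 & -10 \end{bmatrix}^T$ and $\mathbf{v} = \begin{bmatrix} -6 & 1 \end{bmatrix}^T$
(c) $\mathbf{u} = \begin{bmatrix} 9 & 5 \end{bmatrix}^T$ and $\mathbf{v} = \begin{bmatrix} 7 & -20 \end{bmatrix}^T$
(d) $\mathbf{u} = \begin{bmatrix} -8 & 4 \end{bmatrix}^T$ and $\mathbf{v} = \begin{bmatrix} -3 & -10 \end{bmatrix}^T$
(e) $\mathbf{u} = \begin{bmatrix} 6 & 4 \end{bmatrix}^T$ and $\mathbf{v} = \begin{bmatrix} -3 & 5 \end{bmatrix}^T$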


11. In $P_2(\mathbb{R})$ with the inner product of question 1d, are $u$ and
$v$ orthogonal?


(a) $u = x(x - 1)$ and $v = (x - 2)(x - 3)$

(b) $u = x(x - 2)$ and $v = (x + 1)(x - 3)$

(c) $u = (x - 1)(x - 3)$ and $v = x^2 - 2x$ [S]-314

(d) $u = 2x^2 - 6x$ and $v = x^2 - 3x + 2$

(e) $u = x^2 - 2x - 3$ and $v = 3x^2 - 3$


12. In $C([0, 2\pi])$ with the inner product of question 1f, are $u$
and $v$ orthogonal?


(a) $u = \sin x$ and $v = \cos x$ [S]-314

(b) $u = 3$ and $v = 2 - 2x$

(c) $u = \sin x$ and $v = \sin 2x$

(d) $u = e^x$ and $v = -x$


23 


and v = x- - 


_ 4 
() U= a5 


13. In $M_{2\times 2}(\mathbb{R})$ with the inner product of question 1g, are $\mathbf{u}$
and $\mathbf{v}$ orthogonal?


(a) u-| ] : Jana v=| e = | [S]-314 
wy u=| 7! i fanav=| 5 a 
@u=| 7} a Janav=| | 
(d) u-| - 3 |mmav-| . =, 
@u=|¥ 75 Janay =| 3 7 


14. Describe all vectors orthogonal to 


(a) [ 3.5 i in R? with the inner product of question 
la. [A]-355 


(b) x(x—5) in P,(R) with the inner product of question 
le; 

(c) f(x) = 1 in C({0,1]) with the inner product of 
question le. [A]-355 


(d) : i | in M),2(R) with the inner product of 


question lg. 


15. Describe all vectors orthogonal to 


(a) [ 3.5 l' in R? with the inner product of question 
Ib. 


(b) x(x — 2)(x + 3) in P2(R) with the inner product of 
question Id. 


16. Suppose for vectors u,v, w of an inner product space, 


17. 


18. 


(u, Vv) = 3 and (u, w) = i Use this information to com- 
pute 


(a) (u, 3v) 

(b) (u,v + 2w) [A]-355 
(c) (-2v, u) 

(d) (3w,2u) [A]-355 


For what values of a and D is the operator (,) : R?xR? > 


uy v1 = 
(| ss |, in |) =aum + burv2 


an inner product? 


Explain why (,) : P2(R) x P2(R) > R, 


(p,q) = p(0)q(0) + pq) 


is not an inner product. [A]-355 


19. Justify the claim. 


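(a) For any vector $\mathbf{v}$ in an inner product space, $\langle \mathbf{0}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0$. [S]-315

(b) For any scalar $c$ and any vectors $\mathbf{u}$ and $\mathbf{v}$ of an inner
product space, $\langle \mathbf{u}, c\mathbf{v} \rangle = c\langle \mathbf{u}, \mathbf{v} \rangle$.

(c) For any vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ of an inner product
space, $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$. [S]-315

(d) For any vectors $\mathbf{u}$ and $\mathbf{v}$ of an inner product space,
$\|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u} - \mathbf{v}\|^2 = 4\langle \mathbf{u}, \mathbf{v} \rangle$.

(e) For any orthogonal vectors $\mathbf{u}$ and $\mathbf{v}$ of an inner
product space, $\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$.

Answers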
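dot product identities Let $\mathbf{u} = u_1, u_2, \ldots, u_n$, $\mathbf{v} = v_1, v_2, \ldots, v_n$, $\mathbf{w} = w_1, w_2, \ldots, w_n$ be arbitrary elements of $\mathbb{R}^n$.
Then

$$\begin{aligned}
(\mathbf{u} + \mathbf{w}) \cdot \mathbf{v} &= (u_1, u_2, \ldots, u_n + w_1, w_2, \ldots, w_n) \cdot (v_1, v_2, \ldots, v_n) \\
&= (u_1 + w_1, u_2 + w_2, \ldots, u_n + w_n) \cdot (v_1, v_2, \ldots, v_n) \\
&= (u_1 + w_1)v_1 + (u_2 + w_2)v_2 + \cdots + (u_n + w_n)v_n \\
&= u_1v_1 + w_1v_1 + u_2v_2 + w_2v_2 + \cdots + u_nv_n + w_nv_n \\
&= (u_1v_1 + u_2v_2 + \cdots + u_nv_n) + (w_1v_1 + w_2v_2 + \cdots + w_nv_n) \\
&= \mathbf{u} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{v}
\end{aligned}$$

and

$$\begin{aligned}
(c\mathbf{u}) \cdot \mathbf{v} &= (c \times u_1, u_2, \ldots, u_n) \cdot (v_1, v_2, \ldots, v_n) \\
&= (cu_1, cu_2, \ldots, cu_n) \cdot (v_1, v_2, \ldots, v_n) \\
&= cu_1v_1 + cu_2v_2 + \cdots + cu_nv_n \\
&= c(u_1v_1 + u_2v_2 + \cdots + u_nv_n) \\
&= c(\mathbf{u} \cdot \mathbf{v}).
\end{aligned}$$

dot product zero Suppose $\mathbf{u} \cdot \mathbf{u} = 0$ for some vector $\mathbf{u} = u_1, u_2, \ldots, u_n$ in $\mathbb{R}^n$. That is, $u_1^2 + u_2^2 + \cdots + u_n^2 = 0$. Since this
is a sum of nonnegative real numbers that add to zero, each term must itself be zero: $u_1 = u_2 = \cdots = u_n = 0$.
Hence $\mathbf{u} = \mathbf{0}$. Now suppose $\mathbf{u} = \mathbf{0}$. Then $u_1 = u_2 = \cdots = u_n = 0$ and $\mathbf{u} \cdot \mathbf{u} = u_1^2 + u_2^2 + \cdots + u_n^2 = 0 + 0 + \cdots + 0 = 0$.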


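inner product on polynomials Given polynomials $p(x) = p_0 + p_1x + p_2x^2$, $q(x) = q_0 + q_1x + q_2x^2$, and $r(x) = r_0 + r_1x + r_2x^2$ in $P_2(\mathbb{R})$ and the operator

$$\langle p(x), q(x) \rangle = \tfrac{2}{5}p_2q_2 + \tfrac{2}{3}p_0q_2 + \tfrac{2}{3}p_1q_1 + \tfrac{2}{3}p_2q_0 + 2p_0q_0,$$

the most challenging part of the verification is the need for some fancy algebra to show properties 1 and 2. The
expression $\langle p, p \rangle$ is manipulated into a sum of squares for this purpose.


1. For any $p$ in $P_2(\mathbb{R})$,

$$\begin{aligned}
\langle p, p \rangle &= \tfrac{2}{5}p_2^2 + \tfrac{4}{3}p_0p_2 + \tfrac{2}{3}p_1^2 + 2p_0^2 \\
&= \tfrac{1}{15}\left[6p_2^2 + 20p_0p_2 + 10p_1^2 + 30p_0^2\right] \\
&= \tfrac{1}{15}\left[(2p_2 + 5p_0)^2 + 2p_2^2 + 5p_0^2 + 10p_1^2\right]
\end{aligned}$$

which is a sum of squares and therefore greater than or equal to 0.


2. $\langle p, p \rangle = \tfrac{1}{15}\left[(2p_2 + 5p_0)^2 + 2p_2^2 + 5p_0^2 + 10p_1^2\right]$ equals 0 if and only if $p_0 = p_1 = p_2 = 0$ since the square
of each appears as a term in the sum. And $p_0 = p_1 = p_2 = 0$ if and only if $p(x) = 0$ (that is, $p = 0$).


3. For any $p, q$ in $P_2(\mathbb{R})$,

$$\begin{aligned}
\langle p, q \rangle &= \tfrac{2}{5}p_2q_2 + \tfrac{2}{3}p_0q_2 + \tfrac{2}{3}p_1q_1 + \tfrac{2}{3}p_2q_0 + 2p_0q_0 \\
&= \tfrac{2}{5}q_2p_2 + \tfrac{2}{3}q_2p_0 + \tfrac{2}{3}q_1p_1 + \tfrac{2}{3}q_0p_2 + 2q_0p_0 \\
&= \tfrac{2}{5}q_2p_2 + \tfrac{2}{3}q_0p_2 + \tfrac{2}{3}q_1p_1 + \tfrac{2}{3}q_2p_0 + 2q_0p_0 \\
&= \langle q, p \rangle
\end{aligned}$$

4. For any $p, q, r$ in $P_2(\mathbb{R})$,

$$\begin{aligned}
\langle p + q, r \rangle &= \langle (p_0 + p_1x + p_2x^2) + (q_0 + q_1x + q_2x^2),\; r_0 + r_1x + r_2x^2 \rangle \\
&= \langle (p_0 + q_0) + (p_1 + q_1)x + (p_2 + q_2)x^2,\; r_0 + r_1x + r_2x^2 \rangle \\
&= \tfrac{2}{5}(p_2 + q_2)r_2 + \tfrac{2}{3}(p_0 + q_0)r_2 + \tfrac{2}{3}(p_1 + q_1)r_1 + \tfrac{2}{3}(p_2 + q_2)r_0 + 2(p_0 + q_0)r_0 \\
&= \tfrac{2}{5}p_2r_2 + \tfrac{2}{5}q_2r_2 + \tfrac{2}{3}p_0r_2 + \tfrac{2}{3}q_0r_2 + \tfrac{2}{3}p_1r_1 + \tfrac{2}{3}q_1r_1 \\
&\qquad + \tfrac{2}{3}p_2r_0 + \tfrac{2}{3}q_2r_0 + 2p_0r_0 + 2q_0r_0 \\
&= \left(\tfrac{2}{5}p_2r_2 + \tfrac{2}{3}p_0r_2 + \tfrac{2}{3}p_1r_1 + \tfrac{2}{3}p_2r_0 + 2p_0r_0\right) \\
&\qquad + \left(\tfrac{2}{5}q_2r_2 + \tfrac{2}{3}q_0r_2 + \tfrac{2}{3}q_1r_1 + \tfrac{2}{3}q_2r_0 + 2q_0r_0\right) \\
&= \langle p, r \rangle + \langle q, r \rangle
\end{aligned}$$

5. For any $p, q$ in $P_2(\mathbb{R})$ and scalar $c$,

$$\begin{aligned}
\langle cp, q \rangle &= \langle c(p_0 + p_1x + p_2x^2),\; q_0 + q_1x + q_2x^2 \rangle \\
&= \langle cp_0 + cp_1x + cp_2x^2,\; q_0 + q_1x + q_2x^2 \rangle \\
&= \tfrac{2}{5}cp_2q_2 + \tfrac{2}{3}cp_0q_2 + \tfrac{2}{3}cp_1q_1 + \tfrac{2}{3}cp_2q_0 + 2cp_0q_0 \\
&= c\left(\tfrac{2}{5}p_2q_2 + \tfrac{2}{3}p_0q_2 + \tfrac{2}{3}p_1q_1 + \tfrac{2}{3}p_2q_0 + 2p_0q_0\right) \\
&= c\langle p, q \rangle
\end{aligned}$$

Thus $\langle p, q \rangle = \tfrac{2}{5}p_2q_2 + \tfrac{2}{3}p_0q_2 + \tfrac{2}{3}p_1q_1 + \tfrac{2}{3}p_2q_0 + 2p_0q_0$ satisfies the five properties of an inner product (and is
therefore an inner product on $P_2(\mathbb{R})$).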
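
The "fancy algebra" in properties 1 and 2 is easy to get wrong by hand, so a quick exact check may be reassuring. The sketch below compares the two expressions for $\langle p, p \rangle$ on a grid of integer coefficients; the function names are illustrative. Because both sides are quadratic in $p_0, p_1, p_2$, agreement on a grid with seven values per variable is enough to conclude they are the same expression.

```python
from fractions import Fraction as F
from itertools import product

def ip_pp(p0, p1, p2):
    """<p, p> written directly from the defining formula."""
    return F(2, 5) * p2 * p2 + F(4, 3) * p0 * p2 + F(2, 3) * p1 * p1 + 2 * p0 * p0

def sum_of_squares(p0, p1, p2):
    """The same quantity rewritten as a sum of squares."""
    return F(1, 15) * ((2 * p2 + 5 * p0) ** 2 + 2 * p2 ** 2 + 5 * p0 ** 2 + 10 * p1 ** 2)

grid = range(-3, 4)
coeffs = list(product(grid, repeat=3))
# The two expressions agree everywhere on the grid.
assert all(ip_pp(*c) == sum_of_squares(*c) for c in coeffs)
# The value is positive for every nonzero coefficient triple on the grid.
assert all(ip_pp(*c) > 0 for c in coeffs if c != (0, 0, 0))
print("identity and positivity hold on the grid")
```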


Exploring Vector Spaces and Inner Product 
Spaces 


5.1 Solution Spaces [3.3, 3.6, 4.1, 4.2] 


Given a coefficient matrix $M$ and a particular vector $\mathbf{b}$, we can use row reduction to determine whether a solution of
$M\mathbf{v} = \mathbf{b}$ exists and find it if it does (section 2.2). We even have an efficient way of finding all the solutions when
there is more than one (section 3.7 page 111). Being able to do this on a case-by-case basis is good, but a more
critical look at the patterns of free and basic variables leads to a more complete understanding of solution sets of linear
systems.


Observations 


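Let $M$ be an $m \times n$ matrix and set $\mathcal{C}$ as the set of all linear combinations of the columns of $M$ and $\mathcal{N}$ as the solution
set of $M\mathbf{v} = \mathbf{0}$. That is,

$$\mathcal{C} = \{M\mathbf{v} : \mathbf{v} \in \mathbb{R}^n\}$$

$$\mathcal{N} = \{\mathbf{v} \in \mathbb{R}^n : M\mathbf{v} = \mathbf{0}\}.$$

Now observe that $\mathcal{C}$ and $\mathcal{N}$ are vector spaces. For one, $\mathcal{C}$ is the collection of all linear combinations of the columns
of $M$. In other words, $\mathcal{C} = \operatorname{span}\{M_{:,1}, M_{:,2}, \ldots, M_{:,n}\}$ and is therefore a vector space (see “span is a subspace” on page
123). For the other, we need to check three things (section 4.1 pages 119 and 120):


1. $M\mathbf{0} = \mathbf{0}$ so $\mathbf{0} \in \mathcal{N}$.
2. For any $\mathbf{u}$ and $\mathbf{v}$ in $\mathcal{N}$, $M(\mathbf{u} + \mathbf{v}) = M\mathbf{u} + M\mathbf{v} = \mathbf{0} + \mathbf{0} = \mathbf{0}$ so $\mathbf{u} + \mathbf{v}$ is in $\mathcal{N}$.
3. For any $\mathbf{v}$ in $\mathcal{N}$ and scalar $c$, $M(c\mathbf{v}) = c(M\mathbf{v}) = c\mathbf{0} = \mathbf{0}$ so $c\mathbf{v}$ is in $\mathcal{N}$.


The zero vector is in $\mathcal{N}$ and $\mathcal{N}$ is closed under vector addition and scalar multiplication. Hence $\mathcal{N}$ is a vector space.

$\mathcal{C}$ is called the column space of $M$ and $\mathcal{N}$ is called the null space of $M$. The dimension of the column space of
$M$ is called the rank of $M$ and the dimension of the null space of $M$ is called the nullity of $M$. For any eigenvalue $\lambda$
of $M$, the null space of $M - \lambda I$ is called the eigenspace of $M$ corresponding to $\lambda$.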


Implications 


Row operations were defined to maintain the solution sets of linear systems. Solutions of a row reduced linear system
are solutions of the original linear system. Using the matrix form for a linear system, this means given a particular
vector $\mathbf{b}$, $M\mathbf{v} = \mathbf{b}$ has the exact same solution set as $(EM)\mathbf{v} = E\mathbf{b}$ for any elementary matrix $E$. In particular this means
the null space of $M$ (the solution set of $M\mathbf{v} = \mathbf{0}$) and the null space of $EM$ (the solution set of $(EM)\mathbf{v} = \mathbf{0}$) are equal
for any elementary matrix $E$. Row operations do not affect the null space of a matrix. Stated another way, $\mathbf{v}$ is in
the null space of $M$ if and only if $\mathbf{v}$ is in the null space of $EM$.

To rigorously prove this claim, let $M$ be an $m \times n$ matrix and $E$ be an $m \times m$ elementary matrix. If $\mathbf{v}$ is in the null
space of $M$ then $M\mathbf{v} = \mathbf{0}$. Hence $(EM)\mathbf{v} = E(M\mathbf{v}) = E\mathbf{0} = \mathbf{0}$, so $\mathbf{v}$ is in the null space of $EM$. [This establishes that if $\mathbf{v}$
is in the null space of $M$ then $\mathbf{v}$ is in the null space of $EM$.] On the other hand, if $\mathbf{v}$ is in the null space of $EM$ then
$(EM)\mathbf{v} = \mathbf{0}$. Hence $E(M\mathbf{v}) = \mathbf{0}$ so $M\mathbf{v} = E^{-1}\mathbf{0} = \mathbf{0}$ and $\mathbf{v}$ is in the null space of $M$. [This establishes that if $\mathbf{v}$ is in
the null space of $EM$ then $\mathbf{v}$ is in the null space of $M$.] Altogether this means the null space of $M$ and the null space
of $EM$ contain exactly the same elements. We have thus established that the following two statements are equivalent
for any $m \times n$ matrix $M$ and any $m \times m$ elementary matrix $E$.


1. $\mathbf{v}$ is in the null space of $M$.


2. $\mathbf{v}$ is in the null space of $EM$.


To take it a step further, this means a certain set of columns of $M$ is linearly dependent if and only if the same set of
columns of $EM$ is linearly dependent.

Thinking of matrix-vector multiplication as taking a linear combination of the columns of the matrix, any linear
combination of the columns of $M$ can be expressed as $M\mathbf{v}$ for some vector $\mathbf{v}$. To say that some set of columns of $M$ is
linearly dependent is to say there is a nonzero vector $\mathbf{v}$ such that $M\mathbf{v} = \mathbf{0}$. The nonzero entries of $\mathbf{v}$ determine the set of
columns. Since $M\mathbf{v} = \mathbf{0}$ if and only if $E(M\mathbf{v}) = E\mathbf{0}$ (because $E$ is invertible) if and only if $(EM)\mathbf{v} = \mathbf{0}$, a certain set of
columns of $M$ is linearly independent if and only if the corresponding set of columns of $EM$ is linearly independent.
More precisely, the exact same linear combination of columns of $M$ that sums to zero, taken of $EM$ instead, will also
sum to zero. Row operations do not affect the linear dependence relationships among the columns of a matrix.
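
A small numerical illustration may help. In this sketch the matrix `M`, the elementary matrix `E` (add $-2$ times row 1 to row 2), and the test vectors are all made-up examples; the helpers `matmul` and `matvec` are plain list-based stand-ins for matrix arithmetic.

```python
def matmul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    """Product of a matrix (list of rows) and a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

M = [[1, 2, 3],
     [2, 4, 6]]
E = [[1, 0],
     [-2, 1]]      # elementary matrix: add -2 times row 1 to row 2
EM = matmul(E, M)
print(EM)          # [[1, 2, 3], [0, 0, 0]]

for v in ([1, 1, -1], [2, -1, 0], [1, 0, 0]):
    # Mv is the zero vector exactly when (EM)v is the zero vector.
    print(v, matvec(M, v), matvec(EM, v))
```

The first two vectors are sent to zero by both `M` and `EM`; the third is sent to zero by neither. The second vector also records a dependence among columns (twice column 1 minus column 2 is zero) that survives the row operation.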


Crumpet 23: Uniqueness of Reduced Row Echelon Form 


The fact that row operations do not affect the linear dependence relationships among the columns of a matrix lies at 
the heart of a proof that the reduced row echelon form of a matrix exists and is unique. The row reduction algorithm 
provides existence. 


Suppose that an m x n matrix M has two reduced row echelon forms, A and B. The pivot columns of A and 
B are exactly those columns that are linearly independent of the columns to their left. This follows from the facts 
that (i) each pivot column of a matrix in reduced row echelon form is linearly independent of the columns on its 
left (it has a nonzero entry in a row where all the columns to its left have zeros); and (ii) each non-pivot column is 
linearly dependent on the columns to its left (it can be written as a linear combination of those columns). Because 
row operations do not alter the linear dependence relations among the columns of a matrix (and A and B are the 
results of series of row operations on $M$), the pivot columns of $A$ must be the same as those of $B$. Since the pivot
columns of a reduced row echelon form contain a 1 in the pivot position and zeros elsewhere, the pivot columns of $A$
and $B$ are in fact equal.


Suppose $A_{:,j}$ is a nonpivot column of $A$. Then $A_{:,j}$ and the pivot columns to its left (if any) form a linearly
dependent set. A nontrivial linear combination of them sums to $\mathbf{0}$. Since $B_{:,j}$ has the same linear dependence relation
with the pivot columns to its left, the same (corresponding) linear combination of $B_{:,j}$ and the pivot columns to its left
sums to $\mathbf{0}$. Since the pivot columns of $A$ and $B$ are equal, it follows that $B_{:,j} = A_{:,j}$. This completes the proof.


Bases for the null space and column space of a matrix 


The very nature of the row reduction algorithm followed by writing solutions of the homogeneous equation $M\mathbf{v} = \mathbf{0}$
in parametric form provides a basis for the null space. Each free variable gives rise to one vector in the parametric
vector form (see section 3.7). The collection of all the vectors from this form comprises a basis.


In full detail, if $v_{f_1}, v_{f_2}, \ldots, v_{f_k}$ are the free variables of the linear system $M\mathbf{v} = \mathbf{0}$, then the row reduction algorithm
leads to a solution of the (parametric vector) form

$$\mathbf{v} = r_1\begin{bmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{n,1} \end{bmatrix} + r_2\begin{bmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{n,2} \end{bmatrix} + \cdots + r_k\begin{bmatrix} a_{1,k} \\ a_{2,k} \\ \vdots \\ a_{n,k} \end{bmatrix}.$$

This form constitutes all the solutions of the homogeneous equation, so the columns of the matrix

$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,k} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,k} \\ \vdots & \vdots & & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,k} \end{bmatrix}$$

span the null space of $M$. The algorithm also provides that $a_{f_1,1} = 1$ while $a_{f_1,2} = a_{f_1,3} = \cdots = a_{f_1,k} = 0$. Similarly,
the entries of row $A_{f_2,:}$ are all zero except the second; the entries of row $A_{f_3,:}$ are all zero except the third; and so on.
Consequently the columns of $A$ are linearly independent. Hence the columns of $A$ (the vectors of the parametric form
of the solution) are a linearly independent spanning set, that is, a basis, for the null space of $M$.

Now suppose $v_{b_1}, v_{b_2}, \ldots, v_{b_\ell}$ are the basic variables for the linear system $M\mathbf{v} = \mathbf{b}$ and $b_1 < b_2 < \cdots < b_\ell$. We will
argue that columns $M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}$ form a basis (linearly independent spanning set) for the column space $\mathcal{C}$. To
see that these columns are linearly independent, let $R$ be the reduced row echelon form of $M$. Then $R_{:,b_j}$ (column $b_j$
of $R$) cannot be written as a linear combination of columns $R_{:,b_1}, R_{:,b_2}, \ldots, R_{:,b_{j-1}}$ (the columns to the left of column $b_j$
corresponding to basic variables) for any $j > 1$. This is clear since $R_{j,b_j} = 1$ while $R_{j,b_1} = R_{j,b_2} = \cdots = R_{j,b_{j-1}} = 0$.
This is sufficient to conclude that $R_{:,b_1}, R_{:,b_2}, \ldots, R_{:,b_\ell}$ are linearly independent. Can you show that, in general, the
vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$, $\mathbf{v}_1 \neq \mathbf{0}$, are linearly dependent if and only if there is a $k > 1$ such that $\mathbf{v}_k$ can be written as a
linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1}$? Answer on page 165. This completes the argument that $R_{:,b_1}, R_{:,b_2}, \ldots, R_{:,b_\ell}$ are
linearly independent. Because row operations do not affect the linear dependence relationships among the columns
of a matrix, we have that $M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}$ are linearly independent.

To see that $M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}$ span the column space of $M$ we will rely on the fact that if $\mathbf{v}$ is in the span of
(the arbitrary set of vectors) $V = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$, then $\operatorname{span} V = \operatorname{span}(V \cup \{\mathbf{v}\})$. Can you support this claim? Answer
on page 165. To see why this is useful, note that if column $f$ of $R$ corresponds to a free variable, then $R_{:,f}$ is a linear
combination of the columns of basic variables to the left of $R_{:,f}$, say $R_{:,b_1}, R_{:,b_2}, \ldots, R_{:,b_j}$ where $b_j < f < b_{j+1}$. This
is because $R_{:,b_1}$, the leftmost column corresponding to a basic variable, has a 1 in its first entry and zeros elsewhere;
$R_{:,b_2}$ has a 1 in its second entry and zeros elsewhere; $R_{:,b_3}$ has a 1 in its third entry and zeros elsewhere; and so on. By
construction, $R_{:,f}$ cannot have a nonzero entry below row $j$ (if it did, it would contain a leading entry and therefore
not be the column of a free variable). In symbols,

$$\begin{bmatrix} R_{:,b_1} & R_{:,b_2} & \cdots & R_{:,b_j} & R_{:,f} \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 & R_{1,f} \\ 0 & 1 & \cdots & 0 & R_{2,f} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & R_{j,f} \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix}.$$

Hence, $R_{:,f}$ is a linear combination of $R_{:,b_1}, R_{:,b_2}, \ldots, R_{:,b_j}$. Because row operations do not affect the linear dependence
relationships among the columns of a matrix, we have that $M_{:,f}$ is a linear combination of $M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_j}$. In
other words, $M_{:,f}$ is in the span of $M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}$. Repeatedly adding the columns of free variables (which are
in the span of the columns of basic variables) to the set $\{M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}\}$ leads to the conclusion that

$$\begin{aligned}
\operatorname{span}\{M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}\} &= \operatorname{span}\{M_{:,1}, M_{:,2}, \ldots, M_{:,n}\} \\
&= \text{column space of } M.
\end{aligned}$$

Hence $\{M_{:,b_1}, M_{:,b_2}, \ldots, M_{:,b_\ell}\}$ is a linearly independent spanning set, that is, a basis, for the column space of $M$.


The preceding discussion justifies two general statements about any $m \times n$ matrix $M$, the combination of which
leads to one of the most fundamental theorems of linear algebra.

• a basis for the null space of $M$ can be formed using one vector for each free variable of the linear system
$M\mathbf{v} = \mathbf{b}$.


• a basis for the column space of $M$ is formed from the columns of $M$ corresponding to the basic variables of the
linear system $M\mathbf{v} = \mathbf{b}$.


These statements mean the rank of $M$ equals the number of basic variables of $M\mathbf{v} = \mathbf{b}$, and the nullity of $M$ equals
the number of free variables of $M\mathbf{v} = \mathbf{b}$. Since $M\mathbf{v} = \mathbf{b}$ has $n$ variables in total, we have the following theorem.


Theorem 12. [Rank and Nullity] If $M$ is an $m \times n$ matrix, then the rank of $M$ and the nullity of $M$ sum to $n$.
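
The two bulleted statements and Theorem 12 can be sketched in a few lines of Python using exact fractions. Everything named here (`rref`, `null_space_basis`, `column_space_basis`, and the sample matrix `M`) is an illustrative assumption rather than anything defined in the text, and column indices are counted from 0 as is usual in Python.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): reduced row echelon form and 0-based pivot columns."""
    R = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(R[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if pr is None:
            continue                      # column c belongs to a free variable
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                R[i] = [x - R[i][c] * y for x, y in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots

def null_space_basis(M):
    """One basis vector per free variable, read off the reduced form."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                # this free variable is 1, the others 0
        for row, p in zip(R, pivots):     # row i of R holds the i-th pivot
            v[p] = -row[f]
        basis.append(v)
    return basis

def column_space_basis(M):
    """Columns of M itself (not of R) in the pivot positions."""
    _, pivots = rref(M)
    return [[row[c] for row in M] for c in pivots]

M = [[1, 2, 0, 3],
     [2, 4, 1, 8],
     [3, 6, 1, 11]]
rank = len(column_space_basis(M))
nullity = len(null_space_basis(M))
print(rank, nullity, len(M[0]))   # 2 2 4
```

For this sample matrix the basic variables sit in columns 1 and 3 and the free variables in columns 2 and 4 (counting from 1), so the rank and the nullity are both 2 and sum to $n = 4$.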


General Solutions 


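Let $\mathbf{v}_p$ be a particular (any single) solution of $M\mathbf{v} = \mathbf{b}$ and $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ be a basis for the null space of $M$. Now
suppose $\mathbf{v}$ is any solution of $M\mathbf{v} = \mathbf{b}$. Then

$$\begin{aligned}
M(\mathbf{v} - \mathbf{v}_p) &= M\mathbf{v} - M\mathbf{v}_p \\
&= \mathbf{b} - \mathbf{b} \\
&= \mathbf{0}.
\end{aligned}$$

By definition, $\mathbf{v} - \mathbf{v}_p$ is in the null space of $M$. Because $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ is a basis for the null space of $M$, there are
coefficients $a_1, a_2, \ldots, a_k$ such that $\mathbf{v} - \mathbf{v}_p = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k$. Hence

$$\mathbf{v} = \mathbf{v}_p + a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k.$$

On the other hand, if $\mathbf{v} = \mathbf{v}_p + a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k$ for some coefficients $a_1, a_2, \ldots, a_k$ and particular solution $\mathbf{v}_p$,
then

$$\begin{aligned}
M\mathbf{v} &= M(\mathbf{v}_p + a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k) \\
&= M\mathbf{v}_p + M(a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k) \\
&= \mathbf{b} + \mathbf{0} \\
&= \mathbf{b}.
\end{aligned}$$

To summarize, these comments justify the following theorem.


Theorem 13. [Characterization of Solutions of a Linear System] For a consistent linear system $M\mathbf{v} = \mathbf{b}$, the
solution set is

$$\mathbf{v} = \mathbf{v}_p + a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k$$

where $\mathbf{v}_p$ is any particular solution of $M\mathbf{v} = \mathbf{b}$ and $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ is a basis for the null space of $M$.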
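
Both directions of the argument can be watched in action on a small example. The matrix `M`, the particular solution `v_p`, and the null space basis `n1`, `n2` below are made-up illustrative data (the basis was found by row reducing `M` by hand); `matvec` is a plain list-based helper.

```python
from itertools import product

def matvec(A, v):
    """Product of a matrix (list of rows) and a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

M = [[1, 2, 0, 3],
     [2, 4, 1, 8],
     [3, 6, 1, 11]]
v_p = [1, 0, 1, 0]                # one particular solution
b = matvec(M, v_p)                # the right-hand side it solves: [1, 3, 4]
n1 = [-2, 1, 0, 0]                # basis for the null space of M
n2 = [-3, 0, -2, 1]

# Direction 1: v_p plus any combination of null space vectors solves Mv = b.
for a1, a2 in product(range(-2, 3), repeat=2):
    v = [p + a1 * x + a2 * y for p, x, y in zip(v_p, n1, n2)]
    assert matvec(M, v) == b

# Direction 2: the difference of two solutions lies in the null space.
other = [-2, 0, -1, 1]            # another solution of Mv = b
diff = [x - y for x, y in zip(other, v_p)]
print(matvec(M, other), diff, matvec(M, diff))
# [1, 3, 4] [-3, 0, -2, 1] [0, 0, 0]
```

Here the difference between the two solutions happens to be exactly `n2`, so `other` is `v_p + 0*n1 + 1*n2`, in agreement with Theorem 13.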
Key Concepts
column space of an $m \times n$ matrix $M$ is $\operatorname{span}\{M_{:,1}, M_{:,2}, \ldots, M_{:,n}\} = \{M\mathbf{v} : \mathbf{v} \in \mathbb{R}^n\}$.


null space of an $m \times n$ matrix $M$ is $\{\mathbf{v} \in \mathbb{R}^n : M\mathbf{v} = \mathbf{0}\}$, the solution set of $M\mathbf{v} = \mathbf{0}$.

eigenspace of $M$ corresponding to $\lambda$ is the null space of $M - \lambda I$.

vector spaces the column space of a matrix and the null space of a matrix are vector spaces.

column space basis the columns of $M$ corresponding to basic variables form a basis for the column space of $M$.


null space basis the vectors in the parametric vector form of the solution of $M\mathbf{v} = \mathbf{0}$ form a basis for the null space
of $M$.


characterization of solutions of a linear system see theorem 13.

row operations (i) do not affect the null space of a matrix, and (ii) do not affect the linear dependence relationships
of the columns of a matrix.

rank the dimension of the column space of a matrix.


nullity the dimension of the null space of a matrix.


Exercises 


1. Is b in the column space of M? 


@ M=| , 5 FP=| a0 | 
ome 2 ho[3| 
ou[ 2 2 p-[ | 

(d) u=| a re je-| : | [S]-315 

(e) w=| : ei j»=| Hs | [A]-355 
(f) u =| " i e-| = | [A]-355 


2. Find a basis for the column space of M from question 
1. [S]-315 [A]-355 


3. Find a basis for the null space of $M$ from question 1. [S]-315 [A]-355


4. $R$ is the reduced row echelon form of $[\, M \mid \mathbf{b} \,]$, the matrix
$M$ augmented by some vector $\mathbf{b}$. (i) Is $\mathbf{b}$ in the column
space of $M$? (ii) Find a basis for the column space
of $M$. (iii) Find a basis for the null space of $M$.


-20 -90 6 153 
(a) M=| -6 -27 2 45 | 
A TS: 2 236 
192 00 1 
R-10 0 10 6 | a 
0 0 0 1 -10/9 
—o if 97 3 
(b) u-| 36 8 44 #1 | 
ca 412 =66 3 
i 20 <117o. Go F9 
R-|0 0 0 1 a 
0 O 0 oO 0 
=154 -=30° <0 <76 35 
owe) 121 20 9 56 2 
Osh og ia 
i OO Sal =77T 1 
R=| 0 1 0 <8 75 | 
OO Dd 8G FG 4 
187 99 -74 -12 
OG <i -—) 4 
G@)M=) 33 _» WW 4 


-154 -77 63 12 


1 0 4 0. 2D 
0 1 S/ll 0 0 
R=|5 9 9 1 9 | [Sh316 
00 0 O01 
oy hh ae 
-126 88 9 -103 
OMe) gs 85 9. (37 
“lf. 7 -& =103 
1 0 0 10/9 0 
pu}? 1 9 ll oO 
001 7/33 0 
co oD GO 4 
je G ii 
-36 15 58 33 
@QM=!) 54 9 -36 -11 
60 =<30 =110 <33 
i 0 =1% 0° = 
0 1 8/3 O -1/3 
R=) 6 0. 0 1 -9/11 Aes 
00 0 0 0 
“1% 92 <96 38 
G 04 BF 6 
(2) M=| -30 80 -90 20 
— fg af gy 
24 -64 72 -16 
1 =8/3 3 =23 0 
0 0 0 0 1 
R=|0 0 0 0 0 
0 0 0 0 0 
0 0 0 0 0 
{3 = & <4 
=46 =<20 “16 =192 
(h) M=| 24 24 24 ~~ 156 
=34 = =16 <84 
96. 39. <78 492 
i -o 0 oO - 37, 
0100 1/2 
R=|0 0 1 0. 3/2 | [A]-355 
000 1 11/12 
0000 0 
(i) $M = \begin{bmatrix} -108 & 54 & -63 & 40 \\ -120 & 60 & -70 & 5 \\ -60 & 30 & -35 & 50 \\ -24 & 12 & -14 & -10 \\ -48 & 24 & -28 & 5 \end{bmatrix}$

$R = \begin{bmatrix} 1 & -1/2 & 7/12 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$
-5 -14 -3 20 -22
3 | 0 -20 22 


Gg) M=} -13 -21 -24 90 -77 
3 7 -3 -30 33 
10 21 6 -80 77 
1 00 0 0 -3 
0 100 0 2 
R=|0 0 1 0 0 14 | [A]-355 
0 0 0 1 0 -8 
0 0 0 0 1 = =3 


5. Make general statements about how the reduced row echelon
form of $[\, M \mid \mathbf{b} \,]$ helps (i) determine whether $\mathbf{b}$ is
in the column space of $M$; (ii) find a basis for the column
space of $M$; and (iii) find a basis for the null space of $M$.


6. M row reduces to R. (i) Verify that b is in the column 
space of M and (ii) find the general solution of Mv = b. 


—s 35. 4 7 -= 97 
(a) M= 7 63 72 42 15 
1 2 F B 


1 eae 8/3 0 773 
R=|0 0  O 1 | 
0 0 0 0 0 
2) 
»-| 69 i 
16 


14 =95 
(b) M=| 4 ey : of -15 


i 38 0 3 
R=i0 0 1.4 =7p 
o & Oo 


-91 104 -2 -18 
(c) M=| 56 -64 2 9 


-154 176 -6 —-24 


1 -8/7 0 0 
R=| 0 0 1 0 
0 0 0 1 


8 221, =12 18 
(d) u-| 9 -56 -32 50 | 

-6 35 20 -30 
10 0 0 

R=|0 1 4/7 | 
00 0 1 
3 

b=] 8 | [A]-355 
-5 


21 -35 15 -84 
-18 30 -10 68 


(e) M | 6 -10 5  -25 


0 


1 -5/3 0 -3 
0 1 -7/5 
0 0 


7. Given an arbitrary $m \times n$ matrix $M$ and a vector $\mathbf{b} \neq \mathbf{0}$, is
the solution set of $M\mathbf{v} = \mathbf{b}$ a vector space? Explain.


8. Columns 2 and 5 of matrix $M$ form a basis for the column
space of $M$. Use this information to help decide whether
$M\mathbf{v} = \mathbf{b}$ is consistent.

90 240 -120 240 -40 


3 8 -4 8 0 
M=) Ag 198 64 «128 20 [eS 
-108 -288 144 -288 45 
200 
16 
=116 
~261 


9. Let $M$ be a $4 \times 5$ coefficient matrix with columns 2, 3, and
4 representing free variables. Argue that the set containing
the fifth column of $M$ and any one of the first four
forms a basis for the column space of $M$.


3 10 4 #7 10 
10. LetM=] 9 45 18 7 Jandb=} 45 |. 
9 25 10 28 25 


(a) Solve Mv = b by inspection. (Row reduce or use 
SageMath if you don’t see it, but then reflect on 
why you did not see it.) 


(b) Use the fact that $\left\{ \begin{bmatrix} 0 \\ -2 \\ 5 \\ 0 \end{bmatrix} \right\}$ is a basis for the null
space of $M$ to write down all the solutions of
$M\mathbf{v} = \mathbf{b}$ in parametric vector form.


(c) Write down three distinct solutions of Mv = b all 
different from the solution in (a). 


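11. Let $M = \begin{bmatrix} 4 & 5 & -5 & -12 \\ 2 & 5 & -5 & -3 \\ 0 & -15 & 15 & -15 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} 24 \\ 6 \\ 30 \end{bmatrix}$. [S]-317 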


(a) Find one solution of Mv = b by inspection. (Row 
reduce or use SageMath if you don’t see it, but then 
reflect on why you did not see it.) 


0 


is a basis for the null 
0 
space of M to help write down all the solutions of 


Mv = b in parametric vector form. 


(b) Use the fact that 


(c) Write down three distinct solutions of Mv = b all 
different from the solution in (a). 




154 242 15 -9 3 15 -12 
12. Let M = 63 99 5 -3 | and b = (a) | 3 7 4 |;A=6 
-112 -176 -10 6 6 -10 20 
i 47 45 -75 
288 F (b) 5 31 -I15 |;A=22 [S]-317 
: 15 27 -23 
(a) Solve Mv = b by inspection. (Row reduce or use 1 4 
SageMath if you don’t see it, but then reflect on 8 a 
: : (c) | -10 -1 10 J;A=14 
why you did not see it.) 
-4 -6 18 
-11 0 
b he fact th | OTN is a basis f  S 
(b) Use the fact that 0 | 3 is a basis for (ay 46 42 =12 aii (apsss 
0 5 20 25 7 


the null space of M to write down all the solutions 


GF ide— Wananncitie ventana 16. Find a basis for the eigenspace corresponding to (the 
: P ae ; : eigenvalue) 2 in question 15. 
(c) Write down three distinct solutions of Mv = b all 
different from the solution in (a). 17. What is the rank of a matrix of all zeros? 
13. Use the fact that row operations were used to reduce 18. What is the rank of a matrix of all ones? [A]-355 
-16 72 -15 -91 —182 -6 19. What is the rank of a matrix of all twos? 
6 -27 5 33 63 3 . ; . : . 
A= 2 -9 5 17 14 6 to B = 20. Given an m X n matrix M and invertible n x n matrix P, 
-10 45 —-5 -49 -9] 6 show that the rank of MP equals the rank of M by the 
2-9 08 0 0 following argument. Let Cy be the column space of M 
005900 and Cyp be the column space of MP. 
to find a set of three columns 
; " : : : (a) Suppose v is in Cy, and show that v is in Cyp. 
of A that are linearly independent. Are there other such (b) Suppose v is in Cup, and show that v is in Cy. 
sets? [A]-355 (c) Conclude that the rank of MP equals the rank of 
8 -16 -18 8 0 0 M. 
14. LetA =] 16 k -45 |,B=] 0 8 9 |, and : ; : : : 
8 32 36 00 0 21. Given an m X n matrix M and invertible m x m matrix Q, 
0 show that the nullity of QM equals the nullity of M by 
v=| 9 the following argument. Let Ny be the null space of M 
~ 3 and Now be the null space of QM. 


(a) Verify that Bv = 0. 
(b) Given that A row reduces to B, find k. 


15. What is the dimension of the eigenspace corresponding 
to (the eigenvalue) λ? 


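(a) Suppose $\mathbf{v}$ is in $N_M$, and show that $\mathbf{v}$ is in $N_{QM}$. 
(b) Suppose $\mathbf{v}$ is in $N_{QM}$, and show that $\mathbf{v}$ is in $N_M$. 
(c) Conclude that the nullity of QM equals the nullity 
of M. 


Answers 


linear dependence Show that the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$, $\mathbf{v}_1 \neq \mathbf{0}$, are linearly dependent if and only if there is a $k > 1$ 
such that $\mathbf{v}_k$ can be written as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1}$. Supposing $\mathbf{v}_k$ can be written as a linear 
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1}$, we have immediately that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$ are linearly dependent. Now suppose 
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$, $\mathbf{v}_1 \neq \mathbf{0}$, are linearly dependent. Then there exists a linear combination 

\[ a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_p\mathbf{v}_p = \mathbf{0} \]

in which not all of the coefficients are zero. It must be that at least one of $a_2, a_3, \ldots, a_p$ is nonzero since $\mathbf{v}_1 \neq \mathbf{0}$. Set $k = \max\{i : a_i \neq 0\}$. Then $k > 1$, 
$a_k \neq 0$, and $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k = \mathbf{0}$, so 

\[ \mathbf{v}_k = -\frac{a_1}{a_k}\mathbf{v}_1 - \frac{a_2}{a_k}\mathbf{v}_2 - \cdots - \frac{a_{k-1}}{a_k}\mathbf{v}_{k-1}. \]


span 

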


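Show that if $\mathbf{v}$ is in the span of (the arbitrary set of vectors) $V = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$, then $\operatorname{span} V = \operatorname{span}(V \cup \{\mathbf{v}\})$. 
First, $\operatorname{span} V \subseteq \operatorname{span}(V \cup \{\mathbf{v}\})$ always since every linear combination of the vectors in $V$ is also a linear 
combination of vectors in $V \cup \{\mathbf{v}\}$ (with the coefficient of $\mathbf{v}$ equal to 0, for example). It remains to show that 
$\operatorname{span}(V \cup \{\mathbf{v}\}) \subseteq \operatorname{span} V$. In other words, if $\mathbf{w} \in \operatorname{span}(V \cup \{\mathbf{v}\})$ then $\mathbf{w} \in \operatorname{span} V$. To that end, suppose 
$\mathbf{w} \in \operatorname{span}(V \cup \{\mathbf{v}\})$ and (a) write $\mathbf{w}$ as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p, \mathbf{v}$; and (b) write $\mathbf{v}$ as a linear 
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$ (which is possible since $\mathbf{v}$ is in the span of $V$): 

\[ \begin{aligned} \mathbf{w} &= a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_p\mathbf{v}_p + a\mathbf{v} \\ \mathbf{v} &= b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \cdots + b_p\mathbf{v}_p. \end{aligned} \]

By substitution, 

\[ \begin{aligned} \mathbf{w} &= a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_p\mathbf{v}_p + a(b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \cdots + b_p\mathbf{v}_p) \\ &= (a_1 + ab_1)\mathbf{v}_1 + (a_2 + ab_2)\mathbf{v}_2 + \cdots + (a_p + ab_p)\mathbf{v}_p \end{aligned} \]

and therefore $\mathbf{w} \in \operatorname{span} V$. 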
5.2 Coordinate Vectors [4.1, 4.2] 


In section 4.2 it was noted that given a basis for a vector space, each vector has a unique (exactly one) representation 
as a linear combination of the vectors in the basis. That means there is a one-to-one correspondence between elements 
in the vector space and linear combinations of vectors in the basis. Each vector in the vector space can be identified by 
its corresponding linear combination, or more succinctly, by the list of coefficients in that linear combination. These 
coefficient lists serve as unique identifiers for the vectors much the same way social security numbers serve as unique 
identifiers for people. A country with a social security system assigns each of its citizens exactly one social security 
number. Each citizen has one social security number and each social security number identifies one person. 

The situation is a little more complicated when a person is a citizen of more than one country with a social security 
system. The same person will have multiple social security numbers, one for each country of which they are a citizen. 
One social security number may be useful in France while another is useful in Mali. In a similar way, a single vector 
will have multiple unique identifiers, one for each basis of the vector space. Each basis gives a different labeling 
system for the vectors (citizens) of its vector space. 

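As pointed out in section 4.2, 

\[ \mathcal{E} = \{I_{:,1}, I_{:,2}, \ldots, I_{:,n}\} \]

is a basis for the vector space $M_{n \times 1}(\mathbb{R})$ (the collection of all $n \times 1$ column matrices with real coefficients), better 
known as $\mathbb{R}^n$.¹ Writing a vector in $\mathbb{R}^n$ as a linear combination of these basis elements is a simple matter: 

\[ \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T = x_1 I_{:,1} + x_2 I_{:,2} + \cdots + x_n I_{:,n}, \]

and this is the only such linear combination. If we write the coefficients as a column vector decorated with the name 
of the basis as in $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{\mathcal{E}}$, we have what is known as a coordinate vector. The vectors we have been writing all along 
have been written with respect to the standard basis, and we will maintain this practice. A vector written with no basis 
indicated means that it has been written with respect to the standard basis. Hence $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ and $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{\mathcal{E}}$ represent the 
same vector. Given a different basis, say $\mathcal{V} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$, writing $\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$ as a linear combination 
may require different coefficients: 

\[ \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n \]

for some $c_1, c_2, \ldots, c_n$. The coefficients of this linear combination, in order, form the unique identifier of $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ in 
the context of the basis $\mathcal{V}$: $\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}_{\mathcal{V}}$, read as the coordinate vector with respect to $\mathcal{V}$. 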


It should be noted that the order of the elements in a basis matters, and that the entries in the coordinate vector 
must correspond to this ordering. The first entry of the coordinate vector corresponds with the first vector in the 
basis. The second entry of the coordinate vector corresponds with the second vector in the basis. And so on. This 
correspondence is required to maintain the uniqueness of representation. Different orderings of the same set of vectors 
provide different coordinate systems for the vector space. A basis is thus an ordered set. 


¹Technically, $M_{n \times 1}(\mathbb{R})$ and $\mathbb{R}^n$ are isomorphic (see section 4.5). 
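The basis $\mathcal{B} = \{I_{:,2}, I_{:,1}, \ldots, I_{:,n}\}$ is different from the standard basis $\mathcal{E}$ (even though, as unordered sets, $\mathcal{B}$ and $\mathcal{E}$ 
are equal). Of course, it is still true that 

\[ \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T = x_1 I_{:,1} + x_2 I_{:,2} + \cdots + x_n I_{:,n}. \]

This fact will never change. But this means the coordinate vector of $\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$ with respect to $\mathcal{B}$ is 
$\begin{bmatrix} x_2 \\ x_1 \\ \vdots \\ x_n \end{bmatrix}_{\mathcal{B}}$. Hence 

\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{\mathcal{E}} = \begin{bmatrix} x_2 \\ x_1 \\ \vdots \\ x_n \end{bmatrix}_{\mathcal{B}}. \]

Letting $\mathcal{C} = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$, can you verify that 

\[ \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}_{\mathcal{C}} = \begin{bmatrix} 6 \\ 4 \\ -3 \end{bmatrix}_{\mathcal{E}} = \begin{bmatrix} 4 \\ 6 \\ -3 \end{bmatrix}_{\mathcal{B}}? \]

Answer on page 174. As a matter of notation, when the basis with respect to which a vector $\mathbf{v}$ is written is important, 
we will enclose the name of the vector in square brackets and subscript it with the name of the basis as in $[\mathbf{v}]_{\mathcal{B}}$. 
Notice that 

\[ \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}_{\mathcal{C}} = 2\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix} \]

and 

\[ \begin{bmatrix} 4 \\ 6 \\ -3 \end{bmatrix}_{\mathcal{B}} = 4\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 6\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 6 \\ -3 \end{bmatrix}. \]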
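The two matrix products above can be checked numerically. The following is a minimal sketch in plain Python; the helper name `mat_vec` and the variable names are made up for illustration and are not part of the text or of SageMath.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector (a list of entries)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# Columns are the vectors of C = {(1,0,0), (1,1,0), (1,1,1)}, in order.
C_E = [[1, 1, 1],
       [0, 1, 1],
       [0, 0, 1]]

# Columns are the vectors of B = {I_{:,2}, I_{:,1}, I_{:,3}}, in order.
B_E = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 1]]

v_from_C = mat_vec(C_E, [2, 7, -3])  # coordinates with respect to C
v_from_B = mat_vec(B_E, [4, 6, -3])  # coordinates with respect to B

print(v_from_C, v_from_B)  # both are the same vector in standard coordinates
```

Both products give the vector with standard coordinates 6, 4, -3, which is the claim the reader was asked to verify.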


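In general, if $\mathcal{B}$ is a basis of $\mathbb{R}^n$ and $[\mathcal{B}]_{\mathcal{E}}$ is the matrix whose columns are the vectors of $\mathcal{B}$ written with respect to the 
standard basis respecting order, then 

\[ \mathbf{v} = [\mathbf{v}]_{\mathcal{E}} = [\mathcal{B}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{B}}. \tag{5.2.1} \]

Note, however, there is nothing special about the standard basis beyond the fact that it is the most familiar. If $\mathcal{C}$ is a 
basis of $\mathbb{R}^n$ and $[\mathcal{B}]_{\mathcal{C}}$ is the matrix whose columns are the vectors of $\mathcal{B}$ written with respect to the basis $\mathcal{C}$ respecting 
order, then 

\[ [\mathbf{v}]_{\mathcal{C}} = [\mathcal{B}]_{\mathcal{C}} [\mathbf{v}]_{\mathcal{B}}. \tag{5.2.2} \]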


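This can be verified by direct calculation. Let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ and $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n\}$ and write the vectors of $\mathcal{B}$ 
with respect to $\mathcal{C}$: 

\[ \begin{aligned} \mathbf{b}_1 &= M_{1,1}\mathbf{c}_1 + M_{2,1}\mathbf{c}_2 + \cdots + M_{n,1}\mathbf{c}_n \\ \mathbf{b}_2 &= M_{1,2}\mathbf{c}_1 + M_{2,2}\mathbf{c}_2 + \cdots + M_{n,2}\mathbf{c}_n \\ &\;\;\vdots \\ \mathbf{b}_n &= M_{1,n}\mathbf{c}_1 + M_{2,n}\mathbf{c}_2 + \cdots + M_{n,n}\mathbf{c}_n \end{aligned} \tag{5.2.3} \]

and $\mathbf{v}$ with respect to $\mathcal{B}$: 

\[ \mathbf{v} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + \cdots + v_n\mathbf{b}_n. \tag{5.2.4} \]

Then 

\[ [\mathcal{B}]_{\mathcal{C}} = \begin{bmatrix} M_{1,1} & M_{1,2} & \cdots & M_{1,n} \\ M_{2,1} & M_{2,2} & \cdots & M_{2,n} \\ \vdots & \vdots & & \vdots \\ M_{n,1} & M_{n,2} & \cdots & M_{n,n} \end{bmatrix} \]

and 

\[ \begin{aligned} [\mathcal{B}]_{\mathcal{C}} [\mathbf{v}]_{\mathcal{B}} &= \begin{bmatrix} M_{1,1} & M_{1,2} & \cdots & M_{1,n} \\ M_{2,1} & M_{2,2} & \cdots & M_{2,n} \\ \vdots & \vdots & & \vdots \\ M_{n,1} & M_{n,2} & \cdots & M_{n,n} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = v_1 \begin{bmatrix} M_{1,1} \\ M_{2,1} \\ \vdots \\ M_{n,1} \end{bmatrix} + v_2 \begin{bmatrix} M_{1,2} \\ M_{2,2} \\ \vdots \\ M_{n,2} \end{bmatrix} + \cdots + v_n \begin{bmatrix} M_{1,n} \\ M_{2,n} \\ \vdots \\ M_{n,n} \end{bmatrix} \\ &= \begin{bmatrix} v_1 M_{1,1} + v_2 M_{1,2} + \cdots + v_n M_{1,n} \\ v_1 M_{2,1} + v_2 M_{2,2} + \cdots + v_n M_{2,n} \\ \vdots \\ v_1 M_{n,1} + v_2 M_{n,2} + \cdots + v_n M_{n,n} \end{bmatrix}. \end{aligned} \tag{5.2.5} \]

On the other hand, direct substitution of (5.2.3) into (5.2.4) yields 

\[ \begin{aligned} \mathbf{v} &= v_1 (M_{1,1}\mathbf{c}_1 + M_{2,1}\mathbf{c}_2 + \cdots + M_{n,1}\mathbf{c}_n) \\ &\quad + v_2 (M_{1,2}\mathbf{c}_1 + M_{2,2}\mathbf{c}_2 + \cdots + M_{n,2}\mathbf{c}_n) \\ &\quad\;\;\vdots \\ &\quad + v_n (M_{1,n}\mathbf{c}_1 + M_{2,n}\mathbf{c}_2 + \cdots + M_{n,n}\mathbf{c}_n) \\ &= (v_1 M_{1,1} + v_2 M_{1,2} + \cdots + v_n M_{1,n})\,\mathbf{c}_1 \\ &\quad + (v_1 M_{2,1} + v_2 M_{2,2} + \cdots + v_n M_{2,n})\,\mathbf{c}_2 \\ &\quad\;\;\vdots \\ &\quad + (v_1 M_{n,1} + v_2 M_{n,2} + \cdots + v_n M_{n,n})\,\mathbf{c}_n \end{aligned} \]

which verifies that (5.2.5) is $[\mathbf{v}]_{\mathcal{C}}$. Equation (5.2.2) is one formula for a so-called change of basis. It gives a formula 
for changing the basis with respect to which $\mathbf{v}$ is written from $\mathcal{B}$ to $\mathcal{C}$. 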

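Being a basis of $\mathbb{R}^n$, $\mathcal{B}$ contains $n$ linearly independent vectors with $n$ entries each, so $[\mathcal{B}]_{\mathcal{C}}$ is an $n \times n$ matrix with 
linearly independent columns, making it an invertible matrix (theorem 7). In particular, if both bases $\mathcal{B}$ and $\mathcal{C}$ are 
written with respect to the standard basis and $\mathbf{v}$ is an arbitrary vector, we have $\mathbf{v} = [\mathcal{B}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{B}}$ and $\mathbf{v} = [\mathcal{C}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{C}}$, so 

\[ [\mathcal{B}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{B}} = [\mathcal{C}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{C}} \]

and we can left-multiply both sides of this equation by $[\mathcal{C}]_{\mathcal{E}}^{-1}$, yielding 

\[ [\mathbf{v}]_{\mathcal{C}} = [\mathcal{C}]_{\mathcal{E}}^{-1} [\mathcal{B}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{B}}. \tag{5.2.6} \]

Comparing this equation to (5.2.2) it must be that $[\mathcal{B}]_{\mathcal{C}} = [\mathcal{C}]_{\mathcal{E}}^{-1} [\mathcal{B}]_{\mathcal{E}}$. 
In retrospect, this should not be surprising. Multiplying $[\mathbf{v}]_{\mathcal{B}}$ by $[\mathcal{B}]_{\mathcal{E}}$ gives $[\mathbf{v}]_{\mathcal{E}}$ (see equation (5.2.1)). This same 
equation tells us that multiplying $[\mathbf{v}]_{\mathcal{E}}$ by $[\mathcal{C}]_{\mathcal{E}}^{-1}$ gives $[\mathbf{v}]_{\mathcal{C}}$. Diagrammatically, 

\[ [\mathbf{v}]_{\mathcal{B}} \xrightarrow{\ \text{times } [\mathcal{B}]_{\mathcal{E}}\ } [\mathbf{v}]_{\mathcal{E}} \xrightarrow{\ \text{times } [\mathcal{C}]_{\mathcal{E}}^{-1}\ } [\mathbf{v}]_{\mathcal{C}}, \]

another way to understand (5.2.6). 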
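Formula (5.2.6) can be tried on the bases from the earlier example. The sketch below assumes the three-dimensional bases $\mathcal{B}$ (standard basis with its first two vectors swapped) and $\mathcal{C}$ from that example, and it solves a linear system rather than forming the inverse explicitly. The function names `mat_vec` and `solve` are illustrative helpers, not SageMath functions.

```python
from fractions import Fraction

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination.

    Assumes the matrix is square and invertible; a singular matrix
    makes the pivot search fail with StopIteration.
    """
    n = len(matrix)
    aug = [[Fraction(a) for a in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        # Produce zeros above and below the pivot.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

B_E = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # columns: vectors of B
C_E = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # columns: vectors of C

v_B = [4, 6, -3]
v_E = mat_vec(B_E, v_B)   # step 1: times [B]_E
v_C = solve(C_E, v_E)     # step 2: times the inverse of [C]_E
print(v_C)
```

Solving $[\mathcal{C}]_{\mathcal{E}}\,\mathbf{x} = [\mathbf{v}]_{\mathcal{E}}$ is the same as multiplying by $[\mathcal{C}]_{\mathcal{E}}^{-1}$, and it follows the two arrows of the diagram one at a time.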


Key Concepts 

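coordinate vector If $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is a basis for a vector space and $\mathbf{v} = c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + \cdots + c_n\mathbf{b}_n$, then 
$\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$ is the coordinate vector of $\mathbf{v}$ with respect to $\mathcal{B}$ and may be denoted $[\mathbf{v}]_{\mathcal{B}}$. 


change of basis given bases $\mathcal{B}$ and $\mathcal{C}$ of a vector space $V$ and $\mathbf{v} \in V$, 

\[ [\mathbf{v}]_{\mathcal{C}} = [\mathcal{B}]_{\mathcal{C}} [\mathbf{v}]_{\mathcal{B}} = [\mathcal{C}]_{\mathcal{E}}^{-1} [\mathcal{B}]_{\mathcal{E}} [\mathbf{v}]_{\mathcal{B}}. \]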


Exercises J) 1 Oo a 0 0 
{fo oflo a]{) of 
1. LetB= { : |, 7 |} What vector v is given by the [S]-317 
: 1 0 0 2 0 0 
ret vector? : ; : (d) B={ 0 0 || 3 0 |, 01 | 
| 1 »| 4 | ©| 2 @| 5 | [A]-355 
: is ' a 1 0 0 1 
2][ 6]{[ 9 © 8={| 4 ate al 
2. Let B = | —4 ],] 14 },) 15 | Find the vector x 0 0 0 0 
2) Ls IIs fr oflo a} 


determined by the coordinate vector. 
() B= 3 8 2 6 
Os 0 ds 3 Is 8 3 4)7| 5 4 
3. Let B = {-5x + 3, 3x + 8}. Find the vector x determined at Weta the vette =-46, 5. =aras aecainnie sector 


by Hie cograuiatevectan with respect to the basis 8. [8 is linearly independent 


(a) 


(a) —4 | (b) | =t | (c) | 1 | (d) 5 | and therefore a basis for its span, and v is in span8.] 
—2 |, -1 |, 8 |, 0 |, 
‘ : * (a) B= ((2, 1,0), (0,0, 1)} 
4. Let B= { a | | Pa | (b) B= {(1,0,0), (0,3,~4) 
e ire ce a (c) B = {(1,0,0),(0, 1,0), (0,0, 1)} 
| a8 | | a. a2 } Dee aes re (d) B= {(6,0,0), (0,3,0), (0,0, -4)} 
Compute x. (ce) B= {(2,-1,9), (6,3, -4), (-8, 1, D} 
4 1 1 -3 [S]-317 
= = fit A 
@) 2} mls} of P| @l () B={(3,4.-3)} [Ab3SS 
1 -1 5 5 8. Write the vector v as a coordinate vector with respect to 
he basis B = ; > 
5. Write the vector v = 3 — 4f + 5f as a coordinate vector me DS —2 }?| -9 If- 
with respect to the basis 8. [B is linearly independent [S]-318 
and therefore a basis for its span, and v is in span8.] 7 
or] 
(a) B= (3-4, P} [S]-317 12 
—2 
(b) B= {1,47} [A]-355 (b) v= 3 | 
(c) B= {P, Ei 1} 9. Write the vector v as a coordinate vector with respect to 
' 6 —5 2 
(d) B= {1% Pe} the basis B = { 3 |, _8 |} of R*. [A]-355 
B= 13, -41, 5° 
aces) wel | 
(f) B= (3-41+5P?,1-1+ 87, 14-51 +67} 
7 mel 
6. Write the vector v = | as a coordinate vector 
3 4 10. Write the vector v as a coordinate vector with respect to 
with respect to the basis 8. [ is linearly independent ; 1 5 : 
and therefore a basis for its span, and v is in span8.] the basis B = { 9 | > | 1 |} of R’. 


‘LLs oh @ v=[ 15 | 
fle al mr=| is 


or 


we 
ve 


or 
11. 


12. 


13. 


14. 


15. 


16. 


17. 


18. 


19. 


20. 


21. 


22. 


Write the vector v as a coordinate vector with respect to 


AEE 


@v=[17 10 5] 


the basis B = 


| of R?. [A]-355 


Write the vector v as a coordinate vector with respect to 


3 2 6 
1 |.) 1 }5) 4 of R?. 
4 9 -9 


@v=[ i 4 4] 


the basis 8 = 


(b) v=[6 -l 10 |’ 
() v=[ 16 8 -17 |’ 


Write down the change-of-basis matrix [B]g for the ba- 
sis in question 8. Multiply v by [Ble and compare your 
answer to that from question 8. [S]-318 

Write down the change-of-basis matrix [8B]; for the ba- 
sis in question ??. Multiply v by [Bie and compare your 
answer to that from question ??. 

Write down the change-of-basis matrix [8]; for the ba- 
sis in question 9. Multiply v by [Bz and compare your 
answer to that from question 9. [A]-356 

Write down the change-of-basis matrix [B]g for the ba- 
sis in question 10. Multiply v by [Ble and compare your 
answer to that from question 10. 


Write down the change-of-basis matrix [B]g for the ba- 
sis in question 11. Multiply v by [Ble and compare your 
answer to that from question 11. [A]-356 

Write down the change-of-basis matrix [8B] for the ba- 
sis in question 12. Multiply v by [Biz and compare your 
answer to that from question 12. 


Given bases B = = F ; and C = 
7 -6 
3 2 2 ' ; 
7 \l 4 of R°, find the change-of-basis matrix 


[Ble. [S]-319 


Given bases B = 
6 -5 
1 ]7| 7 

[B 

Given bases B = { 


LoL 6 J 


trix [B]e. [A]-356 


(ae eees 


| of R’, find the change-of-basis matrix 


2Hsp me - 


of R’, find the change-of-basis ma- 


Lo }15}} me - 


| | : |} of R’, find the change-of-basis matrix 


Given bases B = 


23. 


24. 


25; 


26. 


oii 


28. 


29. 


+) Seika? 56 Given bases 


4 -7 -8 
o-| || ef | 3 | ms 
0 4 5 
0 8 2 
Cc -{ 2 | 6 | -5 
8 8 
basis matrix [B]¢. [A]-356 


-7 
+) So ease? 57 Given bases 


HII 
BIH 


basis matrix [B]c. 
| and 8 and C be as in question 19. [S]- 


of R°, find the change-of- 


NAR WN OO 


of R°, find the change-of- 

2, 

Let v = e 
319 

(a) Find [v]g. 


(b) Find [v]o. 


(c) Using your answer from question 19, calculate 
[Ble [v]g and verify that it equals [v]c. 


Let v = - | and 8 and C be as in question 20. 


(a) Find [v]g. 
(b) Find [v]ce. 


(c) Using your answer from question 20, calculate 
[B]c [v]g and verify that it equals [v]c. 


Letv = | ; | and 8 and C be as in question 21. [A]-356 


(a) Find [v]g. 
(b) Find [v]c. 


(c) Using your answer from question 21, calculate 
[Ble [v]g and verify that it equals [v]c. 


Let v = | 7 | and 8 and C be as in question 22. 


(a) Find [v],. 
(b) Find [v]c. 


(c) Using your answer from question 22, calculate 
[B]c [v]g and verify that it equals [v]c. 


3 
\) Sage ath Cell PA rea | 2 
1 


and 8 and C be as in 


question 23. [A]-356 
(a) Find [v]g. 
(b) Find [v]c. 


(c) Use your answer from question 23 to calculate 
[Bc [v]g and verify that it equals [v]c. 
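30. (SageMathCell 59) Let $\mathbf{v} = \begin{bmatrix} -6 \\ 7 \\ 3 \end{bmatrix}$ and $\mathcal{B}$ and $\mathcal{C}$ be as 
in question 24. 

(a) Find $[\mathbf{v}]_{\mathcal{B}}$. 

(b) Find $[\mathbf{v}]_{\mathcal{C}}$. 

(c) Use your answer from question 24 to calculate 
$[\mathcal{B}]_{\mathcal{C}} [\mathbf{v}]_{\mathcal{B}}$ and verify that it equals $[\mathbf{v}]_{\mathcal{C}}$. 


31. Let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3\}$ and $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2, \mathbf{c}_3\}$ be bases for 
a vector space $V$ and suppose $\mathbf{v} = 2\mathbf{b}_1 + 5\mathbf{b}_2 - 6\mathbf{b}_3$, 
$\mathbf{b}_1 = 3\mathbf{c}_1 - 8\mathbf{c}_2 + 5\mathbf{c}_3$, $\mathbf{b}_2 = 6\mathbf{c}_1 - 2\mathbf{c}_2 + 9\mathbf{c}_3$, and 
$\mathbf{b}_3 = 7\mathbf{c}_1 + 3\mathbf{c}_2 - \mathbf{c}_3$. 


(a) Find $[\mathbf{v}]_{\mathcal{B}}$. 
(b) Find the change-of-basis matrix $[\mathcal{B}]_{\mathcal{C}}$. 
(c) Find $[\mathbf{v}]_{\mathcal{C}}$. 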


32. Given basis B = 3 | : | of R? and change-of- 


9 


basis matrix [B]o = 23 


307 } find the bass 


33. Given basis B = {5 + 9t, 2 — St} of P,{R} and change-of- 


1 8 


basis matrix [Blo = 71 | find the basis C. [S]- 


ay 


The last few exercises connect the algebra with the geometry of coordinate systems in $\mathbb{R}^2$. The geometry and algebra 
of coordinates in $\mathbb{R}^n$ are connected similarly. 


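34. Let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2\}$ and $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2\}$. Find $[\mathbf{v}]_{\mathcal{B}}$ and $[\mathbf{v}]_{\mathcal{C}}$. [S]-320 
35. Let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2\}$ and $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2\}$. Find $[\mathbf{v}]_{\mathcal{B}}$ and $[\mathbf{v}]_{\mathcal{C}}$. 


36. Let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2\}$ and $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2\}$. Find $[\mathbf{v}]_{\mathcal{B}}$ and $[\mathbf{v}]_{\mathcal{C}}$. 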


37. Find the change-of-basis matrix [B]¢ and verify that [B]c [v]g = [v]c in ques 


38. Find the change-of-basis matrix [C]g and verify that [C]g [vlc = [v]g in quest 


tion 36. 


39. Find the change-of-basis matrix [B]¢ and verify that [B]c [v]g = [V]c in ques 
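Answers 


equivalent coordinate vectors The coordinate vector $\begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}_{\mathcal{C}}$ means 2 times the first vector of $\mathcal{C}$ plus 7 times the 
second vector of $\mathcal{C}$ minus 3 times the third vector of $\mathcal{C}$: $2\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Similarly, $\begin{bmatrix} 6 \\ 4 \\ -3 \end{bmatrix}_{\mathcal{E}}$ means 6 
times the first vector of $\mathcal{E}$ plus 4 times the second vector of $\mathcal{E}$ minus 3 times the third vector of $\mathcal{E}$: $6I_{:,1} + 4I_{:,2} - 3I_{:,3}$. 
Finally, $\begin{bmatrix} 4 \\ 6 \\ -3 \end{bmatrix}_{\mathcal{B}}$ means 4 times the first vector of $\mathcal{B}$ plus 6 times the second vector of $\mathcal{B}$ minus 3 times the 
third vector of $\mathcal{B}$. Hence 

\[ \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}_{\mathcal{C}} = 2\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 7 \\ 7 \\ 0 \end{bmatrix} - \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ -3 \end{bmatrix} \]

and 

\[ \begin{bmatrix} 4 \\ 6 \\ -3 \end{bmatrix}_{\mathcal{B}} = 4I_{:,2} + 6I_{:,1} - 3I_{:,3} = \begin{bmatrix} 0 \\ 4 \\ 0 \end{bmatrix} + \begin{bmatrix} 6 \\ 0 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ -3 \end{bmatrix}. \]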
5.3 Orthogonalization [4.6, 5.2] 


As it is in linear algebra, determining the linear combination of basis vectors that sums to a given vector is big 
business in engineering and the sciences. This problem is generally aided by careful choice of basis vectors (see 
Legendre Polynomials, Fourier Series, and Finite Element Methods, for example). But what makes one set of basis 
vectors more amenable than another? To get some idea, suppose we have an arbitrary basis $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ of 
an inner product space $V$ and an arbitrary vector $\mathbf{v}$ in $V$. The problem stated mathematically is to find coefficients 
$a_1, a_2, \ldots, a_n$ such that 

\[ \mathbf{v} = a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n. \tag{5.3.1} \]

If $V = \mathbb{R}^n$, this is the vector form of a linear system. It can be solved by row reduction. If $V \neq \mathbb{R}^n$, it is less obvious 
how to solve. Using coordinate vectors (see section 5.2) is one way to turn (5.3.1) into a linear system, but doing so 
does not shed light on simplifying the process. Since $V$ is an inner product space, though, perhaps the inner product 
can be leveraged instead. 

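Taking the inner product of both sides of (5.3.1) with $\mathbf{b}_1$ and then $\mathbf{b}_2$, and so on through $\mathbf{b}_n$, produces the following 
linear system. 

\[ \begin{aligned} \langle \mathbf{v}, \mathbf{b}_1 \rangle &= \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_1 \rangle \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle &= \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_2 \rangle \\ &\;\;\vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle &= \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_n \rangle \end{aligned} \tag{5.3.2} \]

By inner product space properties 4 and 5, we have 

\[ \begin{aligned} \langle \mathbf{v}, \mathbf{b}_1 \rangle &= a_1\langle \mathbf{b}_1, \mathbf{b}_1 \rangle + a_2\langle \mathbf{b}_2, \mathbf{b}_1 \rangle + \cdots + a_n\langle \mathbf{b}_n, \mathbf{b}_1 \rangle \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle &= a_1\langle \mathbf{b}_1, \mathbf{b}_2 \rangle + a_2\langle \mathbf{b}_2, \mathbf{b}_2 \rangle + \cdots + a_n\langle \mathbf{b}_n, \mathbf{b}_2 \rangle \\ &\;\;\vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle &= a_1\langle \mathbf{b}_1, \mathbf{b}_n \rangle + a_2\langle \mathbf{b}_2, \mathbf{b}_n \rangle + \cdots + a_n\langle \mathbf{b}_n, \mathbf{b}_n \rangle \end{aligned} \]

which in matrix form is 

\[ \begin{bmatrix} \langle \mathbf{b}_1, \mathbf{b}_1 \rangle & \langle \mathbf{b}_2, \mathbf{b}_1 \rangle & \cdots & \langle \mathbf{b}_n, \mathbf{b}_1 \rangle \\ \langle \mathbf{b}_1, \mathbf{b}_2 \rangle & \langle \mathbf{b}_2, \mathbf{b}_2 \rangle & \cdots & \langle \mathbf{b}_n, \mathbf{b}_2 \rangle \\ \vdots & \vdots & & \vdots \\ \langle \mathbf{b}_1, \mathbf{b}_n \rangle & \langle \mathbf{b}_2, \mathbf{b}_n \rangle & \cdots & \langle \mathbf{b}_n, \mathbf{b}_n \rangle \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \langle \mathbf{v}, \mathbf{b}_1 \rangle \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle \\ \vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle \end{bmatrix}. \tag{5.3.3} \]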

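No matter the vector, no matter the basis, and no matter the inner product space, the problem of writing a vector with 
respect to a given basis is reduced to solving a linear system of equations, a problem that we have studied extensively! 

If that were all there were to it, it would be enough (though no better than using coordinate vectors). Note that 
(5.3.1) is a linear system of $n$ equations in $n$ unknowns, and so is (5.3.3). Is one really better than the other? This 
section began with the promise that careful choice of basis would help. Since the process of row reduction involves 
producing zeros above and below the pivots, starting with some zeros in these entries of the coefficient matrix of 
(5.3.3) would be beneficial. For example, if $\langle \mathbf{b}_1, \mathbf{b}_2 \rangle$ were zero, it would put zeros in the 1,2-entry and the 2,1-entry. 
More generally, if $\langle \mathbf{b}_i, \mathbf{b}_j \rangle$ were zero, it would put zeros in the $i,j$-entry and the $j,i$-entry. The more orthogonal pairs 
of basis vectors (zero inner products between basis vectors) the better. Thinking greedily, if $\langle \mathbf{b}_i, \mathbf{b}_j \rangle = 0$ for all pairs 
$i, j$, $i \neq j$, the system reads 

\[ \begin{bmatrix} \langle \mathbf{b}_1, \mathbf{b}_1 \rangle & 0 & \cdots & 0 \\ 0 & \langle \mathbf{b}_2, \mathbf{b}_2 \rangle & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \langle \mathbf{b}_n, \mathbf{b}_n \rangle \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \langle \mathbf{v}, \mathbf{b}_1 \rangle \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle \\ \vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle \end{bmatrix} \tag{5.3.4} \]

and has solution $a_1 = \frac{\langle \mathbf{v}, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}$, $a_2 = \frac{\langle \mathbf{v}, \mathbf{b}_2 \rangle}{\langle \mathbf{b}_2, \mathbf{b}_2 \rangle}$, ..., $a_n = \frac{\langle \mathbf{v}, \mathbf{b}_n \rangle}{\langle \mathbf{b}_n, \mathbf{b}_n \rangle}$. Better yet, if $\langle \mathbf{b}_1, \mathbf{b}_1 \rangle = \langle \mathbf{b}_2, \mathbf{b}_2 \rangle = \cdots = \langle \mathbf{b}_n, \mathbf{b}_n \rangle = 1$, the 
solution is simply $a_1 = \langle \mathbf{v}, \mathbf{b}_1 \rangle$, $a_2 = \langle \mathbf{v}, \mathbf{b}_2 \rangle$, ..., $a_n = \langle \mathbf{v}, \mathbf{b}_n \rangle$: the coefficients are just the inner products of $\mathbf{v}$ with 
each of the basis vectors. 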
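The solution of (5.3.4) is easy to try out. The sketch below uses the dot product on $\mathbb{R}^3$ and a hypothetical orthogonal basis chosen only for illustration; neither the basis nor the vector comes from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product, computed exactly with rational arithmetic."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

# Hypothetical orthogonal basis of R^3: every pair of distinct vectors has dot product 0.
basis = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
v = (5, 1, 4)

# a_i = <v, b_i> / <b_i, b_i>, one coefficient per basis vector, no row reduction needed.
coeffs = [dot(v, b) / dot(b, b) for b in basis]

# Rebuild v from the coefficients as a check on (5.3.1).
rebuilt = [sum(a * b[i] for a, b in zip(coeffs, basis)) for i in range(len(v))]

print(coeffs, rebuilt)
```

Each coefficient is found independently of the others, which is exactly the simplification an orthogonal basis buys.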
The question then turns to (a) establishing bases within which pairs of vectors are orthogonal, and (b) possibly 
ensuring all their norms (inner products $\langle \mathbf{b}_i, \mathbf{b}_i \rangle$) are one. The prototypical example of such a basis is the standard basis 
$\mathcal{E} = \{I_{:,1}, I_{:,2}, \ldots, I_{:,n}\}$ with inner product $\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v}$ (the dot product, after which inner products were modeled). 
Can you verify that the inner product of every pair of distinct vectors in $\mathcal{E}$ is zero and that the norm of each vector in 
$\mathcal{E}$ is one? Answer on page 182. The standard basis $\mathcal{E} = \{1, t, t^2\}$ of $P_2(\mathbb{R})$ (see page 127) with inner product (4.6.2) 
does not have these properties. Can you identify at least one violation? Answer on page 182. 

The SageMath output below demonstrates a process for taking any basis of $\mathbb{R}^3$ and modifying it so that all pairs 
of vectors are orthogonal, a process called orthogonalization. 

Basis B: 


w2 = (7, -1, 9) - (4, -4, 0) = (3, 3, 9) 
w3 = (-8, 3, 3) - (-11/2, 11/2, 0) - (4/11, 4/11, 12/11) = (-63/22, -63/22, 21/11) 


Can you verify that the vectors 
from the SageMath snapshot are orthogonal? Answer on page 181. The first vector of the original basis is taken as 
the first vector of the orthogonal basis. The second vector of the original basis minus a particular vector is taken as 
the second vector of the orthogonal basis. The third vector of the original basis minus two particular vectors is taken 
as the third vector of the orthogonal basis. But what vectors ought to be subtracted? A clever observation will answer 
the question. 

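Given any nonzero vectors $\mathbf{b}_1$ and $\mathbf{b}_2$, 

\[ \begin{aligned} \left\langle \mathbf{b}_1, \mathbf{b}_2 - \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\mathbf{b}_1 \right\rangle &= \langle \mathbf{b}_1, \mathbf{b}_2 \rangle - \left\langle \mathbf{b}_1, \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\mathbf{b}_1 \right\rangle \\ &= \langle \mathbf{b}_1, \mathbf{b}_2 \rangle - \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\langle \mathbf{b}_1, \mathbf{b}_1 \rangle \\ &= \langle \mathbf{b}_1, \mathbf{b}_2 \rangle - \langle \mathbf{b}_2, \mathbf{b}_1 \rangle \\ &= 0 \end{aligned} \tag{5.3.5} \]

so $\mathbf{b}_1$ and $\mathbf{b}_2 - \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\mathbf{b}_1$ are always orthogonal (even if $\mathbf{b}_1$ and $\mathbf{b}_2$ are not). This applies in any inner product space, not 
just $\mathbb{R}^n$. The term $\frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\mathbf{b}_1$ is called the component of $\mathbf{b}_2$ in the $\mathbf{b}_1$ direction or the orthogonal projection of $\mathbf{b}_2$ onto 
$\mathbf{b}_1$, and is often denoted $\mathrm{proj}_{\mathbf{b}_1}\mathbf{b}_2$. Subtracting this term from $\mathbf{b}_2$ removes the component of $\mathbf{b}_2$ in the $\mathbf{b}_1$ direction, 
leaving only the component of $\mathbf{b}_2$ orthogonal to $\mathbf{b}_1$ (not in the direction of $\mathbf{b}_1$). In $\mathbb{R}^2$, this is seen geometrically in the 
following diagram. 


[Diagram: the vectors $\mathbf{b}_1$ and $\mathbf{b}_2$, with $\mathrm{proj}_{\mathbf{b}_1}\mathbf{b}_2$ along $\mathbf{b}_1$ and $\mathbf{b}_2 - \mathrm{proj}_{\mathbf{b}_1}\mathbf{b}_2$ orthogonal to $\mathbf{b}_1$] 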
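As a quick numerical instance of (5.3.5) under the dot product, take $\mathbf{b}_2 = (7, -1, 9)$ from the SageMath output and, as an assumption made only for illustration, $\mathbf{b}_1 = (1, -1, 0)$, a choice that is consistent with the projections shown in that output.

```latex
\[
\begin{aligned}
\mathrm{proj}_{\mathbf{b}_1}\mathbf{b}_2
  &= \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\,\mathbf{b}_1
   = \frac{7 + 1 + 0}{1 + 1 + 0}\,(1, -1, 0)
   = (4, -4, 0), \\
\mathbf{b}_2 - \mathrm{proj}_{\mathbf{b}_1}\mathbf{b}_2
  &= (7, -1, 9) - (4, -4, 0) = (3, 3, 9), \\
\langle \mathbf{b}_1, (3, 3, 9) \rangle
  &= 3 - 3 + 0 = 0.
\end{aligned}
\]
```

Any nonzero multiple of the assumed $\mathbf{b}_1$ would give the same projection, since the scale cancels between the numerator and the denominator.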
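A set of three nonzero vectors, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$, can be orthogonalized by extending the process to $\mathbf{b}_3$. Its components 
in the directions of both $\mathbf{b}_1$ and $\mathbf{b}_2 - \frac{\langle \mathbf{b}_2, \mathbf{b}_1 \rangle}{\langle \mathbf{b}_1, \mathbf{b}_1 \rangle}\mathbf{b}_1$ will need to be subtracted. Letting $\mathbf{w}_1 = \mathbf{b}_1$ and $\mathbf{w}_2 = \mathbf{b}_2 - \mathrm{proj}_{\mathbf{w}_1}\mathbf{b}_2$, 
$\mathbf{w}_3 = \mathbf{b}_3 - \mathrm{proj}_{\mathbf{w}_1}\mathbf{b}_3 - \mathrm{proj}_{\mathbf{w}_2}\mathbf{b}_3$. For larger sets, the process continues recursively. 

\[ \begin{aligned} \mathbf{w}_1 &= \mathbf{b}_1 \\ \mathbf{w}_j &= \mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_1}\mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_2}\mathbf{b}_j - \cdots - \mathrm{proj}_{\mathbf{w}_{j-1}}\mathbf{b}_j, \quad j = 2, 3, \ldots, n. \end{aligned} \tag{5.3.6} \]

The process as described by (5.3.6) is called orthogonalization, or Gram-Schmidt orthogonalization. 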
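Formula (5.3.6) translates almost line for line into code. The following is a minimal sketch in plain Python using the dot product on $\mathbb{R}^3$; the function names are made up for illustration. The second and third input vectors are read off the SageMath output above, while the first, (1, -1, 0), is an assumption chosen to be consistent with the projections shown there.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def proj(w, b):
    """Orthogonal projection of b onto the nonzero vector w."""
    scale = Fraction(dot(b, w)) / Fraction(dot(w, w))
    return [scale * x for x in w]

def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of vectors following (5.3.6)."""
    ws = []
    for b in vectors:
        w = [Fraction(x) for x in b]
        # Subtract the projection of b onto each previously computed w.
        for prev in ws:
            p = proj(prev, b)
            w = [wi - pi for wi, pi in zip(w, p)]
        ws.append(w)
    return ws

# First vector assumed; the other two appear in the SageMath output.
basis = [(1, -1, 0), (7, -1, 9), (-8, 3, 3)]
ws = gram_schmidt(basis)
for w in ws:
    print([str(x) for x in w])
```

With these inputs the second and third vectors agree with w2 and w3 in the SageMath output, and all three pairwise dot products are zero. Exact rational arithmetic is used so that the zero inner products are exactly zero rather than small floating-point numbers.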


A few details have thus far been overlooked. For one, formula (5.3.6) only works if the denominators $\langle \mathbf{w}_1, \mathbf{w}_1 \rangle$, 
$\langle \mathbf{w}_2, \mathbf{w}_2 \rangle$, ..., $\langle \mathbf{w}_{n-1}, \mathbf{w}_{n-1} \rangle$ of the projections are all nonzero. That is, the vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{n-1}$ are nonzero. 
Can you provide an argument that $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ being linearly independent assures $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{n-1}$ are nonzero? 
HINT: Show that if $\mathbf{w}_j = \mathbf{0}$ for some $j$, then $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is linearly dependent. Answer on page 182. For two, the 
discussion at the beginning of the section establishes that (5.3.1) implies (5.3.3), but what we have been relying on is 
the converse, that (5.3.3) implies (5.3.1). To start resolving this issue, can you show that if $\langle \mathbf{w}, \mathbf{b}_1 \rangle = \langle \mathbf{w}, \mathbf{b}_2 \rangle = \cdots = 
\langle \mathbf{w}, \mathbf{b}_n \rangle = 0$ for some vector $\mathbf{w} \in V$, then $\mathbf{w} = \mathbf{0}$? Answer on page 182. This means the zero vector is orthogonal to 
(has inner product zero with) every vector in a basis and in fact is the only such vector. This fact plays a prominent 
role in this discussion. For three, the span of the orthogonalized vectors is the same as the span of the original vectors. 
This is a particularly important feature of the process if you are orthogonalizing the basis of a subspace. See crumpet 
24 for resolutions of these last two issues. 


Crumpet 24: Details of the Process of Orthogonalization 


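To show that (5.3.3) implies (5.3.1), we follow the steps establishing that (5.3.1) implies (5.3.3) backward. The 
same properties of inner product spaces that gave us that (5.3.2) implies (5.3.3) work in reverse, giving us (5.3.3) 
implies (5.3.2). However, the implication from (5.3.2) to (5.3.1) is not as straightforward. Assuming (5.3.2) we need 
to show that $\mathbf{v} = a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n$. Moving everything to the lefthand side in (5.3.2), 

\[ \begin{aligned} \langle \mathbf{v}, \mathbf{b}_1 \rangle - \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_1 \rangle &= 0 \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle - \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_2 \rangle &= 0 \\ &\;\;\vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle - \langle a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n, \mathbf{b}_n \rangle &= 0 \end{aligned} \]

which implies 

\[ \begin{aligned} \langle \mathbf{v} - (a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n), \mathbf{b}_1 \rangle &= 0 \\ \langle \mathbf{v} - (a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n), \mathbf{b}_2 \rangle &= 0 \\ &\;\;\vdots \\ \langle \mathbf{v} - (a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n), \mathbf{b}_n \rangle &= 0 \end{aligned} \]

so $\mathbf{v} - (a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n)$ is orthogonal to each basis vector. As shown in “zero inner products” on page 182, 
this means $\mathbf{v} - (a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n) = \mathbf{0}$ and therefore $\mathbf{v} = a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \cdots + a_n\mathbf{b}_n$. This settles issue two. 


To show that the span of a set of vectors is not changed by orthogonalization, we show something stronger: that 
$\operatorname{span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j\} = \operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j\}$ for all $j = 1, 2, \ldots, n$. Let $W_j = \operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j\}$, making $W_j$ a vector 
space with dimension $j$. Each $\mathbf{w}_i$ is by construction a linear combination of $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_i\}$ and therefore in $W_i$. Of 
course $W_1 = \operatorname{span}\{\mathbf{b}_1\} \subseteq W_2 = \operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2\} \subseteq \cdots \subseteq W_j = \operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j\}$ so $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j$ are all in $W_j$. If we 
knew that $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j$ were linearly independent, we would have $j$ linearly independent vectors in a $j$-dimensional 
vector space, making $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j\}$ a basis for $W_j$. Being a basis, $\operatorname{span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j\}$ would equal $W_j$ and we 
would be done. 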


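Direct computation shows that a set of nonzero vectors $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$ in which every pair of distinct vectors 
is orthogonal is linearly independent. Suppose $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$ for scalars $c_1, c_2, \ldots, c_p$. Then 

\[ \begin{aligned} 0 = \langle \mathbf{0}, \mathbf{v}_1 \rangle &= \langle c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_p\mathbf{v}_p, \mathbf{v}_1 \rangle \\ &= \langle c_1\mathbf{v}_1, \mathbf{v}_1 \rangle + \langle c_2\mathbf{v}_2, \mathbf{v}_1 \rangle + \cdots + \langle c_p\mathbf{v}_p, \mathbf{v}_1 \rangle \\ &= c_1\langle \mathbf{v}_1, \mathbf{v}_1 \rangle + c_2\langle \mathbf{v}_2, \mathbf{v}_1 \rangle + \cdots + c_p\langle \mathbf{v}_p, \mathbf{v}_1 \rangle \\ &= c_1\langle \mathbf{v}_1, \mathbf{v}_1 \rangle \end{aligned} \]

(see exercise 19a in section 4.6) since $\langle \mathbf{v}_i, \mathbf{v}_j \rangle = 0$ whenever $i \neq j$ (every pair of distinct vectors is orthogonal). But 
$\mathbf{v}_1$ is a nonzero vector, so $\langle \mathbf{v}_1, \mathbf{v}_1 \rangle \neq 0$ (inner product property 2). Hence $c_1$ must be zero. Similarly $c_2, c_3, \ldots, c_p$ must 
also be zero. 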


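It remains to show that every pair of distinct vectors in $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is orthogonal. Every pair of distinct 
vectors in $\{\mathbf{w}_1\}$ is orthogonal (vacuously since there are no distinct pairs in the set). By (5.3.5), $\mathbf{w}_1$ and $\mathbf{w}_2$ are 
orthogonal so every pair of distinct vectors in $\{\mathbf{w}_1, \mathbf{w}_2\}$ is orthogonal. Now suppose every pair of distinct vectors in 
$\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{j-1}\}$ is orthogonal for some $j \geq 3$. If $1 \leq i < j$, then 

\[ \begin{aligned} \langle \mathbf{w}_i, \mathbf{w}_j \rangle &= \langle \mathbf{w}_i, \mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_1}\mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_2}\mathbf{b}_j - \cdots - \mathrm{proj}_{\mathbf{w}_{j-1}}\mathbf{b}_j \rangle \\ &= \left\langle \mathbf{w}_i, \mathbf{b}_j - \frac{\langle \mathbf{b}_j, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{b}_j, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 - \cdots - \frac{\langle \mathbf{b}_j, \mathbf{w}_{j-1} \rangle}{\langle \mathbf{w}_{j-1}, \mathbf{w}_{j-1} \rangle}\mathbf{w}_{j-1} \right\rangle \\ &= \langle \mathbf{w}_i, \mathbf{b}_j \rangle - \left\langle \mathbf{w}_i, \frac{\langle \mathbf{b}_j, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 \right\rangle - \left\langle \mathbf{w}_i, \frac{\langle \mathbf{b}_j, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 \right\rangle - \cdots - \left\langle \mathbf{w}_i, \frac{\langle \mathbf{b}_j, \mathbf{w}_{j-1} \rangle}{\langle \mathbf{w}_{j-1}, \mathbf{w}_{j-1} \rangle}\mathbf{w}_{j-1} \right\rangle \\ &= \langle \mathbf{w}_i, \mathbf{b}_j \rangle - \frac{\langle \mathbf{b}_j, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\langle \mathbf{w}_i, \mathbf{w}_1 \rangle - \frac{\langle \mathbf{b}_j, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\langle \mathbf{w}_i, \mathbf{w}_2 \rangle - \cdots - \frac{\langle \mathbf{b}_j, \mathbf{w}_{j-1} \rangle}{\langle \mathbf{w}_{j-1}, \mathbf{w}_{j-1} \rangle}\langle \mathbf{w}_i, \mathbf{w}_{j-1} \rangle \\ &= \langle \mathbf{w}_i, \mathbf{b}_j \rangle - \frac{\langle \mathbf{b}_j, \mathbf{w}_i \rangle}{\langle \mathbf{w}_i, \mathbf{w}_i \rangle}\langle \mathbf{w}_i, \mathbf{w}_i \rangle \\ &= \langle \mathbf{w}_i, \mathbf{b}_j \rangle - \langle \mathbf{b}_j, \mathbf{w}_i \rangle \\ &= 0 \end{aligned} \]

since $\langle \mathbf{w}_i, \mathbf{w}_k \rangle = 0$ whenever $k \neq i$. By induction, every pair of distinct vectors in $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is orthogonal. 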


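A set of vectors in which every pair of distinct vectors is orthogonal is called an orthogonal set. A vector with 
norm 1 is called a unit vector. If each vector in an orthogonal set is a unit vector, it is called an orthonormal set. 
If all the vectors in an orthogonal set are scaled to have norm 1 (are normalized), the orthogonal set becomes an 
orthonormal set with the same span. 


Returning to the original question of writing vectors as linear combinations of basis elements, we see that if 
$\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is an orthogonal basis, then (5.3.3) reduces to 

\[ a_i = \frac{\langle \mathbf{v}, \mathbf{b}_i \rangle}{\langle \mathbf{b}_i, \mathbf{b}_i \rangle}. \]

In words, writing a vector as a linear combination of orthogonal basis vectors amounts to projecting the vector onto 
each of the basis vectors. As a formula, if $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is an orthogonal basis and $\mathbf{v}$ is an arbitrary vector in 
an inner product space, then 

\[ \mathbf{v} = \left(\mathrm{proj}_{\mathbf{b}_1}\mathbf{v}\right) + \left(\mathrm{proj}_{\mathbf{b}_2}\mathbf{v}\right) + \cdots + \left(\mathrm{proj}_{\mathbf{b}_n}\mathbf{v}\right). \]

In terms of coordinate vectors, 

\[ [\mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} \langle \mathbf{v}, \mathbf{b}_1 \rangle / \langle \mathbf{b}_1, \mathbf{b}_1 \rangle \\ \langle \mathbf{v}, \mathbf{b}_2 \rangle / \langle \mathbf{b}_2, \mathbf{b}_2 \rangle \\ \vdots \\ \langle \mathbf{v}, \mathbf{b}_n \rangle / \langle \mathbf{b}_n, \mathbf{b}_n \rangle \end{bmatrix}. \]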
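Key Concepts 

unit vector vector with norm 1. 

orthogonal set set in which every pair of distinct vectors is orthogonal. 

orthonormal set orthogonal set of unit vectors. 

orthogonal projection (of one vector onto another) $\mathrm{proj}_{\mathbf{u}}\mathbf{v} = \frac{\langle \mathbf{v}, \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle}\mathbf{u}$. 

orthogonal basis basis that is an orthogonal set. If $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is an orthogonal basis and $\mathbf{v}$ is an arbitrary 
vector in an inner product space, then 

\[ \mathbf{v} = \left(\mathrm{proj}_{\mathbf{b}_1}\mathbf{v}\right) + \left(\mathrm{proj}_{\mathbf{b}_2}\mathbf{v}\right) + \cdots + \left(\mathrm{proj}_{\mathbf{b}_n}\mathbf{v}\right). \]

normalize scale a vector to meet a certain criterion. Often this means scaling so the norm is one. 

(Gram-Schmidt) orthogonalization given a linearly independent set $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$, the set $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ defined by 

\[ \begin{aligned} \mathbf{w}_1 &= \mathbf{b}_1 \\ \mathbf{w}_j &= \mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_1}\mathbf{b}_j - \mathrm{proj}_{\mathbf{w}_2}\mathbf{b}_j - \cdots - \mathrm{proj}_{\mathbf{w}_{j-1}}\mathbf{b}_j, \quad j = 2, 3, \ldots, n \end{aligned} \]

has the property that, for $j = 1, 2, \ldots, n$, $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j\}$ is an orthogonal basis for $\operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j\}$. 

orthogonality to a basis The only vector orthogonal to every vector of a basis is $\mathbf{0}$. 

orthogonal sets and linear independence an orthogonal set of nonzero vectors is linearly independent. 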


Exercises 15 1 
(f) S=4] -5 |,} 6 
For all exercises, the inner product space is the dot product on 3 5 
R" unless specified otherwise. ; i6 
; 7 he a 
1. Is va unit vector? If not, normalize it. (g) S = 3 fF 5 || _5 | [S]-321 
@v=[3 4] ayla | L-# 
(b) v= oe a -2 3 24 
| _ 8 ; ! (h) S = 9 |.) -2 |,| 13 
© v=[ 53 w | (8}321 6 4 -11 
@v=[a -w —w | {Tei 18 
-[{ -2 2 1 E @ S= O },/ 8 |,} -8 ],} 10 
(@) v=[-3 3 -} | (A135 o9}i7tL4a lle 
(f) v=| i 2 et 2 | [A]-356 [A]-356 
(g) v= [ Z 2 z 4 | 3. Repeat question 2 in R” with inner product 


2. Determine whether S$ is orthogonal. 


ose ofa | 


osef[2 1] fom 
oof 51-2] 

oof 2}[8115] 
ete 


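$$\langle \mathbf{u}, \mathbf{v}\rangle = u_1v_1 + 2u_2v_2 + \cdots + nu_nv_n$$

where $\mathbf{u} = [\, u_1 \ u_2 \ \cdots \ u_n \,]$ and $\mathbf{v} = [\, v_1 \ v_2 \ \cdots \ v_n \,]$. [S]-321 [A]-356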


4. Is it possible for a pair of nonzero vectors to be orthogo- 
nal in multiple inner product spaces? 


5. In the inner product space P2(R), with inner product 
(;) : Po(R) x P2(R) > R, 


(p,q) = p(O)qO) + pq) + pg), 


which of the following polynomials are orthogonal to 
P21? 
(a) 0 +6t-7 

(b) 3-1-2 [S]-321 
(c) 6f — 11t+5 

(d) 40 — 8t+3 [A]-356 
(e) 2° —t-3 


6. Let u = [ 2 3 |, v = [4 -3 |, and w = 


[ -6 4 |. Find the orthogonal projection. 


(a) proj,v [S]-321 
(b) proj,w 
(c) proj,w [A]-356 
(d) proj,,u 
(e) projyu [A]-356 
(f) projyv 


Sketch the projection of question 6 and the two vectors 
involved in the projection on the same set of axes. [A]- 
356 


Approximate proj, Vv. 


an 


(a) H 
4 
3 Vv 
2 u 
> 
2 1 £0 1 2 3 4 5 6 7 
= 
[A]-357 
4 
(b) 
> 
8 1 
(c) 
-6 3 
[A]-357 


(@) : 
1 
Vv 
> 
1.0 2 3 4 5 6 7 8 
u 

-2 

3 
9. Redo question 8 approximating $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ instead. [A]-357


10. True or false? 


(a) In any vector space, a pair of orthogonal vectors is 
also perpendicular. 


(b) If two vectors in $\mathbb{R}^n$ are orthogonal relative to one inner product, then they are orthogonal relative to all inner products.

(c) The zero vector is orthogonal to all vectors. 


(d) Any five vectors in R° can be orthogonalized (to 
form a set of five orthogonal vectors). 


11. $\mathcal{B}$ is an orthogonal basis of $\mathbb{R}^n$. Find $[\mathbf{v}]_{\mathcal{B}}$.


ome 1 bel 
oe 3][ 5 pe[5]o 


of SIL PE 
ed 


18 
P] | [S]-321 


wf ZEN 


me Ww OO 


| 


12. Find an orthogonal set with the same span. 


LHS pom 


5 
(e) 2 |] -4 |? [A]-357 
1 1 
5 -1 
(f) 9} —4 |, 
1 
ele 


(h) a) SageMathCell 60 


Lanes | 


(i) .)) Sageath Cell 61 


[A]-357 


—2 1 2 2 
-4 3 —2 2 
ota lel bho | eee 
3 1 9 -4 


13. Orthonormalize (normalize the vectors resulting from the 
orthogonalization process as they are computed). 


0 (2a p om 
OLEH al 


som) 


[A]-357 
-8 8 6 
(d) a) SageMathCell 63 -~9 ; 7 : 3 
8 3 5 
14. Is the orthogonal set from question 12 a basis for $\mathbb{R}^n$ (for any $n$)? [A]-357

15. Scale the vectors in the orthogonalization of question 12 to find an orthonormal set. [A]-357

16. Find an orthogonal set with the same span as S from question 2 and normalize each vector so it becomes an orthonormal set. [S]-322 [A]-358


17. Find an orthogonal set with the same span as S from 
question 2 relative to the inner product of question 3 and 
normalize each vector so it becomes an orthonormal set. 
[S]-323 [A]-358 


18. The given set $S$ is linearly dependent so we should not expect orthogonalization to lead to an orthogonal basis for $\operatorname{span} S$. Orthogonalize anyway and explain what happens. For which $j$ is the set $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_j\}$ (the result of orthogonalizing the first $j$ elements of $S$) an orthogonal basis for the span of the first $j$ elements of $S$?

ms-i3 / fs} 
mols ha] [a] 

of ENGI 

19. Orthogonalize $\{1, t, t^2\}$ in $P_2(\mathbb{R})$ with inner product

$$\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)\,dx.$$

Then scale each one so its graph passes through (1, 1). The resulting functions are the first three Legendre polynomials.

20. Orthogonalize $\{1, t, t^2\}$ in $P_2(\mathbb{R})$ with inner product $\langle p, q\rangle = p(0)q(0) + p(1)q(1) + p(2)q(2)$. [A]-358

21. Explain why an orthogonal set is a basis for its span.

22. $\mathcal{B} = \{\sin t, \sin 2t, \sin 3t, \sin 4t\}$ is an orthogonal set in $C([-\pi, \pi])$ with inner product

$$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx$$

and is therefore a basis for its span, $W$. Let $h(t) = t$ and
(a) calculate $\operatorname{proj}_{\sin kt} h$ for $k = 1, 2, 3, 4$; and
(b) graph $h$ and $\operatorname{proj}_{\sin t} h + \operatorname{proj}_{\sin 2t} h + \operatorname{proj}_{\sin 3t} h + \operatorname{proj}_{\sin 4t} h$ on the same set of axes.

23. Repeat question 22 with

-1 t<0O 
A(t) = ; 
@ f t>0 

Answers

orthogonal pairs The three possible dot products are all zero: 


$$\begin{bmatrix} 5 \\ -5 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 3 \\ 9 \end{bmatrix} = 5(3) - 5(3) + 0(9) = 0$$
standard basis inner products For pairs of basis vectors, $i < j$,

$$\langle I_{:,i}, I_{:,j}\rangle = 0\cdot 0 + \cdots + 0\cdot 0 + \underbrace{1\cdot 0}_{i\text{th term}} + 0\cdot 0 + \cdots + 0\cdot 0 + \underbrace{0\cdot 1}_{j\text{th term}} + 0\cdot 0 + \cdots + 0\cdot 0 = 0.$$

For $i > j$, the same computation holds by symmetry (property 3 of inner product spaces). For the norms of basis vectors,

$$\|I_{:,i}\| = \sqrt{0\cdot 0 + \cdots + 0\cdot 0 + \underbrace{1\cdot 1}_{i\text{th term}} + 0\cdot 0 + \cdots + 0\cdot 0} = 1.$$


P.(R) violation Violations are easy to come by. The inner products (1, t) and (t, t”) are both zero, but (1, 77) is not: 
(1,7) = (10) + 0 + 0°), 0) + 0) + 1@)) 
2: 2 2 2 
= =(0-1)+ go le 30-0) + 300-0) +20 -0) 


WINN 


2 
None of the norms are one:~ 


1] = dl, 1) = 1) 30-0) + 50-0) + 30-0) + 20-1) 42-1) = v2 


lItll = Vi, = 50: 0)+ 50: 0) + 50. +50, 0) + 2(0-0) = ne 


Ir ll = Vie, 2 - 2 d, +50, +50, 0) + SC 0) +2(0-0) = qe 


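orthogonalized vectors are nonzero Suppose $\mathbf{w}_j = \mathbf{0}$ for some $j$. Since each $\mathbf{w}_i$ is a linear combination of $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_i$,

$$\mathbf{w}_j = \mathbf{b}_j - \operatorname{proj}_{\mathbf{w}_1}\mathbf{b}_j - \operatorname{proj}_{\mathbf{w}_2}\mathbf{b}_j - \cdots - \operatorname{proj}_{\mathbf{w}_{j-1}}\mathbf{b}_j = \mathbf{b}_j + \text{some linear combination of } \mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{j-1},$$

giving a nontrivial linear combination of $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j$ that sums to $\mathbf{0}$. This implies $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_j\}$, and therefore $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$, is linearly dependent.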


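zero inner products Let $\mathbf{w} = c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + \cdots + c_n\mathbf{b}_n$. Then $\langle \mathbf{w}, \mathbf{b}_1\rangle = \langle \mathbf{w}, \mathbf{b}_2\rangle = \cdots = \langle \mathbf{w}, \mathbf{b}_n\rangle = 0$ means $c_1\langle \mathbf{w}, \mathbf{b}_1\rangle = c_2\langle \mathbf{w}, \mathbf{b}_2\rangle = \cdots = c_n\langle \mathbf{w}, \mathbf{b}_n\rangle = 0$. By properties 3, 4 and 5 of inner product spaces (see page 154), $\langle \mathbf{w}, c_1\mathbf{b}_1\rangle = \langle \mathbf{w}, c_2\mathbf{b}_2\rangle = \cdots = \langle \mathbf{w}, c_n\mathbf{b}_n\rangle = 0$, so $\langle \mathbf{w}, c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + \cdots + c_n\mathbf{b}_n\rangle = 0$. But $c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + \cdots + c_n\mathbf{b}_n = \mathbf{w}$, so $\langle \mathbf{w}, \mathbf{w}\rangle = 0$ and by property 2, $\mathbf{w} = \mathbf{0}$.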


2Since 1? = 1, it is just as well to see that the norm squared is not 1 as in |[1||? = (1,1) = 2(0-0) + (1 -O)+ (0-0) + (0: 1)+ 21-1) =2. 
Figure 5.4.1: Reflection about the line $\ell$ is linear 


5.4 Similarity and Diagonalization [3.7, 4.4, 5.2] 


Figure 5.4.1 illustrates that reflection about an arbitrary line through the origin is a linear transformation. As we saw in section 4.4, there must therefore be a 2 × 2 matrix $M$ such that $T(\mathbf{x}) = M\mathbf{x}$. In the same section, we also learned that the columns of $M$ must satisfy $M_{:,j} = T(I_{:,j})$, $j = 1, 2$, so if we knew the images of the standard basis elements, we would know $M$.

There is another way. Imagine rotating the plane by angle $-\alpha$ about the origin, then reflecting about the x-axis, then rotating (back) by angle $\alpha$ about the origin. The line $\ell$ would first map onto the x-axis, all vectors/points/sets in the plane would then be reflected about this image, and then the line and the reflected images would be rotated back so that $\ell$ would be back where it started. All vectors/points/sets and their reflections would maintain their relative positions across $\ell$. In the end it would be as though the vectors/points/sets were simply reflected about line $\ell$.

The composition of the two rotations and the reflection is easy enough to write down as a matrix transformation. 
Using the standard matrix for reflection about the x-axis (given in the discussion of section 4.4) and the standard 
matrix for counterclockwise rotation about the origin (from exercise 12 of section 4.4), 


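$$M = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \cos(-\alpha) & -\sin(-\alpha) \\ \sin(-\alpha) & \cos(-\alpha) \end{bmatrix} = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix} \tag{5.4.1}$$

which can of course be calculated to get

$$M = \begin{bmatrix} \cos^2\alpha - \sin^2\alpha & 2\cos\alpha\sin\alpha \\ 2\cos\alpha\sin\alpha & -(\cos^2\alpha - \sin^2\alpha) \end{bmatrix} = \begin{bmatrix} \cos(2\alpha) & \sin(2\alpha) \\ \sin(2\alpha) & -\cos(2\alpha) \end{bmatrix}. \tag{5.4.2}$$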


Interestingly this form reveals that T can also be described by reflection about the x-axis followed by counterclockwise 
rotation by 2a about the origin. Can you see why? Answer on page 190. Standard matrices for scaling in the direction 
of any line and shearing along any line can be derived similarly. 
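A quick numerical check of the rotate, reflect, rotate back construction can be reassuring. The sketch below is an illustration only: the helper names and the sample angle are assumptions, and it simply multiplies the three standard matrices and compares the product with the closed form in terms of $2\alpha$.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rotation(theta):
    """Standard matrix of counterclockwise rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

REFLECT_X = [[1, 0], [0, -1]]  # reflection about the x-axis

def reflection_about_line(alpha):
    """Rotate by -alpha, reflect about the x-axis, rotate back by alpha."""
    return matmul(rotation(alpha), matmul(REFLECT_X, rotation(-alpha)))

alpha = math.pi / 6  # arbitrary sample angle for the line
M = reflection_about_line(alpha)
closed_form = [[math.cos(2 * alpha), math.sin(2 * alpha)],
               [math.sin(2 * alpha), -math.cos(2 * alpha)]]
print(M)
print(closed_form)
```

The two printed matrices agree up to floating-point rounding for any choice of `alpha`.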
Considering $M$ in form (5.4.1) leads to a deeper perspective. As seen in equation (5.2.1), left-multiplying by an invertible matrix can always be interpreted as a change of basis. For example, left-multiplication by the matrix

$$P = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix},$$

the leftmost matrix in (5.4.1), can be interpreted as changing from the basis

$$\mathcal{B} = \left\{ \begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix}, \begin{bmatrix} -\sin\alpha \\ \cos\alpha \end{bmatrix} \right\}$$

to the standard basis,

$$\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}.$$

Note that the rightmost matrix of (5.4.1) is $P^{-1}$. Can you verify this? Answer on page 190. Accordingly, the rightmost matrix of (5.4.1) can be interpreted as changing from the standard basis to basis $\mathcal{B}$. Altogether then, left-multiplication by $M$ represents a change from the standard basis to basis $\mathcal{B}$, then reflection about the first basis vector in $\mathcal{B}$ (which lies along line $\ell$), followed by a change of basis from $\mathcal{B}$ to the standard basis. The transformation starts with coordinates relative to the standard basis and ends with coordinates relative to the standard basis.

Taking this new perspective allows us to understand general transformations such as

$$S : \mathbb{R}^2 \to \mathbb{R}^2, \quad S(\mathbf{u}) = \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}\mathbf{u}$$

geometrically by writing the matrix as a product $PAP^{-1}$ where the action of $A$ is easily understood. In particular, matrices $A$ of the form

$$\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}$$

(diagonal matrices) are easily understood geometrically. They can be written as the product

$$\begin{bmatrix} 1 & 0 \\ 0 & \beta \end{bmatrix} \begin{bmatrix} \alpha & 0 \\ 0 & 1 \end{bmatrix}$$

which according to section 4.4 is the standard matrix of scaling by a factor of $\alpha$ in the direction of the first basis vector followed by scaling by a factor of $\beta$ in the direction of the second basis vector. If $\alpha$ is negative, it incorporates a reflection about the second basis vector, and if $\beta$ is negative it incorporates a reflection about the first basis vector.
Summarizing, suppose we have the standard matrix $M$ of a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$ and we want to better understand $M$ by finding a matrix $P$ and a diagonal matrix $D$ such that $M = PDP^{-1}$. Right-multiplying both sides by $P$ we require $MP = PD$. The left side of this equation can be written as

$$\begin{bmatrix} MP_{:,1} & MP_{:,2} & \cdots & MP_{:,n} \end{bmatrix}$$

and the right side can be written as

$$\begin{bmatrix} D_{1,1}P_{:,1} & D_{2,2}P_{:,2} & \cdots & D_{n,n}P_{:,n} \end{bmatrix}.$$

Equating columns of the two sides, we get

$$MP_{:,1} = D_{1,1}P_{:,1}$$
$$MP_{:,2} = D_{2,2}P_{:,2}$$
$$\vdots$$
$$MP_{:,n} = D_{n,n}P_{:,n}.$$

The columns of $P$ must be eigenvectors of $M$ and the entries of $D$ the corresponding eigenvalues!
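The column-by-column condition is easy to test by machine. Here is a minimal sketch for the matrix of $S$ introduced above, whose eigenvalues the text goes on to give as −5 and 5. The particular eigenvectors used as columns of `P` are an assumption (any nonzero multiples would do), and the helper names are made up for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def column(A, j):
    """Column j of A as a list."""
    return [row[j] for row in A]

M = [[-7, -4], [6, 7]]   # standard matrix of S
P = [[-2, 1], [1, -3]]   # assumed eigenvectors of M as columns
D = [[-5, 0], [0, 5]]    # corresponding eigenvalues on the diagonal

MP = matmul(M, P)
PD = matmul(P, D)
print(MP, PD)

# Column j of MP should be D[j][j] times column j of P.
for j in range(2):
    print(column(MP, j), [D[j][j] * x for x in column(P, j)])
```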
Figure 5.4.2: Visualizing the action of S 


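In the particular case of $S(\mathbf{u}) = \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}\mathbf{u}$ the eigenvalues are −5 and 5 with corresponding eigenvectors $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -3 \end{bmatrix}$ respectively. Can you provide the calculation? Answer on page 190. It follows that

$$\begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} -5 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix}^{-1}$$

and we see that $S$ has the effect of reflection about the second vector of the basis $\mathcal{C} = \left\{ \begin{bmatrix} -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -3 \end{bmatrix} \right\}$ plus expansion by a factor of 5. This is because multiplication by

$$\begin{bmatrix} -5 & 0 \\ 0 & 5 \end{bmatrix}$$

has the effect of reflection about the second basis vector (rooted at the origin) plus expansion by a factor of 5. As with reflection relative to the standard basis, the reflection about the second basis vector occurs parallel to the first vector. Because the standard basis vectors meet at a right angle, reflection is done along lines perpendicular to the axes. If the basis vectors meet at a different angle, reflection is done along lines meeting at that same angle.

This analysis can be verified by plotting a couple of points and their images under $S$. For example,

$$S\left(\begin{bmatrix} -2 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 10 \\ -5 \end{bmatrix} \quad\text{and}\quad S\left(\begin{bmatrix} -2 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 9 \end{bmatrix}. \tag{5.4.3}$$

Figure 5.4.2 shows the geometry of $S$ with respect to both the standard basis and the basis $\mathcal{C}$. The purple grid shows coordinates with respect to $\mathcal{C}$. The $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ direction is the positive “x” axis (first basis vector) and the $\begin{bmatrix} 1 \\ -3 \end{bmatrix}$ direction is the positive “y” axis (second basis vector) in these coordinates. Notice the point $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ has coordinates $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ with respect to the purple grid. In other words,

$$\begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}_{\mathcal{C}}.$$

They are the same point in the plane! Coordinates with respect to $\mathcal{C}$ make it easy to trace the action under $S$. $\begin{bmatrix} 1 \\ 0 \end{bmatrix}_{\mathcal{C}}$ is first reflected about the second basis vector, making its coordinates $\begin{bmatrix} -1 \\ 0 \end{bmatrix}_{\mathcal{C}}$ (with respect to the purple grid) and then expanded by a factor of 5, making its coordinates $\begin{bmatrix} -5 \\ 0 \end{bmatrix}_{\mathcal{C}}$ (with respect to the purple grid). Other points can be understood similarly. The coordinates of point $P$ are $\begin{bmatrix} -2 \\ 3 \end{bmatrix}$ with respect to the standard basis, but $\begin{bmatrix} 3/5 \\ -4/5 \end{bmatrix}_{\mathcal{C}}$ with respect to $\mathcal{C}$ (marked as $P$ in figure 5.4.2). Can you verify this? Answer on page 190. Reflection across the second basis vector gives it coordinates $\begin{bmatrix} -3/5 \\ -4/5 \end{bmatrix}_{\mathcal{C}}$ (shown in figure 5.4.2 connected to $P$ by an orange dashed line segment crossing the “y” axis parallel to the “x” axis) and then expanding by a factor of 5 gives it coordinates $\begin{bmatrix} -3 \\ -4 \end{bmatrix}_{\mathcal{C}}$. This point is marked as $S(P)$ in figure 5.4.2. What are the coordinates of $S(P)$ with respect to the standard basis? Answer on page 191.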


Similarity 
Separating the matrix calculations of the preceding discussion from their geometric interpretation, we have been examining matrices $A$ and $B$ related by the equation

$$B = P^{-1}AP,$$

a relation known as similarity. Indeed, matrices $A$ and $B$ are called similar if $B = P^{-1}AP$ for some invertible matrix $P$ (or equivalently $PBP^{-1} = A$). Consistent with the name, matrices related this way share a number of similar features.


Theorem 14. If matrices $A$ and $B$ are similar, then $A$ and $B$ have the same (i) determinant, (ii) eigenvalues, and (iii) rank.

(i) By theorem 8 and the fact that $\det P^{-1} = \frac{1}{\det P}$ (section 3.7), $\det B = \det(P^{-1}AP) = \det P^{-1} \cdot \det A \cdot \det P = \frac{1}{\det P} \cdot \det A \cdot \det P = \det A$.

(ii) For any value $\lambda$,

$$\begin{aligned} \det(B - \lambda I) &= \det(P^{-1}AP - \lambda I) \\ &= \det(P^{-1}AP - P^{-1}(\lambda I)P) \\ &= \det(P^{-1}(A - \lambda I)P) \\ &= \det(P^{-1}) \cdot \det(A - \lambda I) \cdot \det P \\ &= \det(A - \lambda I) \end{aligned}$$

so the characteristic equations of $B$ and $A$ are equal, making the eigenvalues of $B$ and $A$ equal.

(iii) By exercises 20 and 21 of section 5.1, neither right-multiplying nor left-multiplying a matrix by an invertible matrix changes its rank, so the rank of $A$, which equals the rank of $PBP^{-1}$, equals the rank of $B$.
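Parts (i) and (ii) of the theorem can be spot-checked on a small example. The sketch below is not part of the text's argument: the invertible matrix `P` is an arbitrary choice made for illustration, the helper names are hypothetical, and for a 2 × 2 matrix the characteristic polynomial is $\lambda^2 - (\text{trace})\lambda + \det$, so equal trace and determinant mean equal eigenvalues.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(P):
    """Inverse of an invertible 2x2 matrix, computed exactly."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def det_2x2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[-7, -4], [6, 7]]
P = [[1, 2], [3, 5]]                       # arbitrary invertible matrix (assumption)
B = matmul(inverse_2x2(P), matmul(A, P))   # B = P^{-1} A P is similar to A

print(B)
print(det_2x2(A), det_2x2(B))   # equal determinants
print(trace(A), trace(B))       # equal traces, hence equal characteristic polynomials
```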


Other similar features of similar matrices are explored in the exercises.

Another important feature of similar matrices is that powers of similar matrices are similar. That is, if $A$ and $B$ are square and similar, then $A^k$ and $B^k$ are similar, where the $k$th power of $M$ is defined by

$$M^k = \underbrace{M \cdot M \cdots M}_{k \text{ times}}$$

(analogous to powers of real numbers). To see that this is true, write $A = PBP^{-1}$ and compute

$$\begin{aligned} A^k &= (PBP^{-1})^k \\ &= \underbrace{PBP^{-1} \cdot PBP^{-1} \cdots PBP^{-1}}_{k \text{ times}} \\ &= P\,\underbrace{B \cdot B \cdots B}_{k \text{ times}}\,P^{-1} \\ &= PB^kP^{-1}. \end{aligned}$$


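This is particularly useful when $A$ is similar to a diagonal matrix. In this case, $A = PDP^{-1}$ for a diagonal matrix $D$ and

$$A^k = PD^kP^{-1} = P \begin{bmatrix} D_{1,1} & 0 & \cdots & 0 \\ 0 & D_{2,2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & D_{n,n} \end{bmatrix}^k P^{-1} = P \begin{bmatrix} D_{1,1}^k & 0 & \cdots & 0 \\ 0 & D_{2,2}^k & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & D_{n,n}^k \end{bmatrix} P^{-1}$$

so the difficulty in raising $A$ to any power is commensurate with the difficulty of raising $D$ to that power. Earlier we found that $\begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}$ is similar to the diagonal matrix $\begin{bmatrix} -5 & 0 \\ 0 & 5 \end{bmatrix}$, so

$$\begin{aligned} \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}^5 &= \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} -5 & 0 \\ 0 & 5 \end{bmatrix}^5 \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix}^{-1} \\ &= \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} -3125 & 0 \\ 0 & 3125 \end{bmatrix} \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix}^{-1} \\ &= \begin{bmatrix} -4375 & -2500 \\ 3750 & 4375 \end{bmatrix}. \end{aligned}$$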


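While that may not be the most pleasant computation, it certainly beats computing

$$\begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}^5 = \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix} \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix} \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix} \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix} \begin{bmatrix} -7 & -4 \\ 6 & 7 \end{bmatrix}$$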


directly. This property of similar matrices is at the heart of the power method for estimating eigenvalues (section 6.2), 
which is at the heart of Markov chain problems (section 7.2). 
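The fifth-power computation can be reproduced both ways in a short script. This is a sketch under assumptions: the eigenvector matrix `P` is one valid choice of eigenvectors for the eigenvalues −5 and 5 (the text's own choice may differ by scaling), and `matpow` is a hypothetical helper that multiplies the matrix out directly.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(M, k):
    """k-th power of a square matrix by repeated multiplication."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

M = [[-7, -4], [6, 7]]
P = [[-2, 1], [1, -3]]                               # assumed eigenvectors as columns
P_inv = [[Fraction(-3, 5), Fraction(-1, 5)],
         [Fraction(-1, 5), Fraction(-2, 5)]]         # inverse of P
D5 = [[(-5) ** 5, 0], [0, 5 ** 5]]                   # powers of a diagonal matrix are entrywise

via_diagonalization = matmul(P, matmul(D5, P_inv))
direct = matpow(M, 5)
print(via_diagonalization)
print(direct)
```

Raising `D` to a power costs one scalar power per diagonal entry, which is the point of the diagonalization route.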


When a matrix $M$ is similar to a diagonal matrix $D$, we say that $M$ is diagonalizable and that the matrix $P$ of $P^{-1}MP = D$ diagonalizes $M$. We saw earlier that if $P^{-1}MP = D$ then the columns of $P$ are eigenvectors. We shall now add the observation that the eigenvectors (columns of $P$) must be linearly independent, a requirement for $P$ to be invertible. On the other hand, if $M$ is an $n \times n$ matrix admitting $n$ linearly independent eigenvectors, then $M$ is diagonalizable and $P$, a matrix whose columns are $n$ linearly independent eigenvectors of $M$, diagonalizes $M$:

$$\begin{aligned} P^{-1}MP &= P^{-1} \begin{bmatrix} \lambda_1 P_{:,1} & \lambda_2 P_{:,2} & \cdots & \lambda_n P_{:,n} \end{bmatrix} \\ &= P^{-1}P \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \\ &= \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \end{aligned}$$

which is diagonal. Altogether, an $n \times n$ matrix $M$ is diagonalizable if and only if $M$ has $n$ linearly independent eigenvectors.


Key Concepts 


similar matrices matrices $A$ and $B$ are similar if there is an invertible matrix $P$ such that $A = P^{-1}BP$ (equivalently $PAP^{-1} = B$ or $PA = BP$).

similarity Matrices that are similar have similarity. Matrices that have similarity have the same (i) determinant, (ii) eigenvalues, and (iii) rank.

diagonalizable a matrix that is similar to a diagonal matrix.

diagonalizability an $n \times n$ matrix $M$ is diagonalizable if and only if $M$ has $n$ linearly independent eigenvectors. Such $M$ is diagonalized by any matrix whose columns are $n$ linearly independent eigenvectors.

powers of matrices The $k$th power of matrix $M$ is defined by

$$M^k = \underbrace{M \cdot M \cdots M}_{k \text{ times}}.$$

similarity and powers The $k$th powers of similar matrices are similar.


geometry of diagonalizable matrices If M is a diagonalizable 2 x 2, respectively 3 x 3, matrix, its action on the 
plane, respectively space, can be understood as a scaling (and possibly reflecting) transformation relative to a 
basis of eigenvectors. 


-l 
1 


[A]-358 
1. Find a matrix P that diagonalizes M and calculate (the 
diagonal matrix) P-'! MP. Eigenvectors of M are given. 
5 4 > I -12 100 30 2 2 
(a) M= | | 5 l| 9 | [S]-323 (f) M=] 75 13 30 |;] -3 |] O |, 
-75 -100 -117 


Exercises | 0 


_[ 2% 32 ][ 8 4 1 
v3 SUSI | 
1 =2 a i =i 
(c) w=| Oa | 4 \| ; | [A]-358 
ee : : 11 -20 30 2 5 
4d ual 7 | (2) M = 8 <i7 36- |S) [oh 
“ & ala Hod 4 -10 18 1 0 
-200 0 0 0 4 28 
(ec) M = a ae ae ae ee 1 | [S}-323 
-19 16 0 > 3 2 
-17 10 6 2 3 2
(h) M=} -7 2 3 45) 1 4.) 3 4] 1
5. Find $P \neq 0$ such that $PA = BP$ for the matrices of question 4. [S]-324 [A]-358
-7 5 O 1 2 3 : Sides oe 

6. Matrices A and B are similar. Find j and k. 
-39 -28 -52 -20 F 
j -70 3 -4 
-102 -91 -171 -48 (a) A=| I B-| 
i = : k 38 1 -8 
wee 68 56 107 32 - 
62 56 99 37 ) =| i y || -9 | 
0 1 3 2 —2 =k -12 3 
3 3 1 —3 7. Find P such that PAP™! = B for the matrices of question 
, , ; [A]-358 

-2 -2 -2 0 6. 
1 2 -l 1 8. Explain why A and B are not similar. 

2. Find the eigenvalue(s) of the matrix M of question 4 -6 -1 |. 5 -10 

1. [S]-324 [A]-358 GAe | =o | -4 8 

3. Is M diagonalizable? All of its eigenvalues are given. [S]-325 
—2 -8 -4 -2 
ool? thelt 2 
(a) u-| a . jar [A]-358 5 -l 7 6 
ore 37 9 B= 1 -2 
61 4 4 1 -4 9 

(b) M= 51 

-25 41 
@ A=| 5 oe oa 

() w=| - ze jnaw 8 2 7 -14 
=
(d) M -| : : | 8,3 [S]-324

(e) $M = \begin{bmatrix} 18 & -8 & 14 \\ -8 & 6 & -7 \\ -32 & 16 & -26 \end{bmatrix}$; −6, 2 [S]-324

4 18 B
alg % 4 “ene
(f) M= 8 >A 10 ; 10,
7

(g) $M = \begin{bmatrix} 15 & 14 & 18 \\ 26 & 3 & 18 \\ -26 & -14 & -29 \end{bmatrix}$; 11, −11

9. What does the SageMath code do?

# Variables
var('a,b,c,d')
# Similar matrices
A = matrix(2,2,[9,-6,5,-8])
B = matrix(2,2,[106,-132,84,-105])
# Unknown matrix
P = matrix(2,2,[a,b,c,d])
print(A);print()
print(B);print()
# Solve for P and print solution
syseq = (A*P-P*B).coefficients()
print(solve(syseq,[a,b,c,d]))

9 2 2 10. Find all matrices P such that A = PBP™'. 

(h) M=| 3 9 =I 435,10 [A]-358 = 
1307 @ ORD «1 -| 7 She 
ee 2 | [S}-325 

(i) M=| 68 -78 106 |; 28,-42,-56 81-3 
76 -32 60 

AiR) Seae lath Coll a | : ; | Bae 
—298 79 359 —32 - 
ae 864-791 -1797 726 |. | 237 64 | 
0 > -800 631 1557-590 |’ ~863  -233 
—464 346 926 -616 5 6 7 
-74, 370, -296, -148 OM) Sagettath Coll Fa z =6§ ee 
4. Matrices A and B are similar. Find k. . a 
8862 1601 -—33246 

5 | i 3 | -| 167k | -6634 -1171 24959 
11 10 -146 -745 2053 372 ~7699 
5 2 —3 2 [A]-359 

) =| -4 3 je-| 0” k | [S]-324 30 #7 
ee a (¢) QEERETSD 67 4-| 6 -2 -2 | B= 

@a-| 35 i feel g | 5 1 -6 

2239 -676 16323 
2 1 3 5 —56 46 —420 
@ a=| 5 see 2 | carsss -315 94 2296 
11. What is the dimension of the solution space in question 
10? [A]-359 


12. Describe the action of the transformation $T(\mathbf{v}) = P^{-1}MP\mathbf{v}$ as simply as you can in geometric terms.


1 0 .pa| cosa sina 
0 O/ | -sine cosa 
1 O 1 0 

. jP=| = s [A]-359 
2; 

0 

9 


cosa sina 


wo 


—sin@w cosa 


-1 0], [1 3 
@ m=| 5 i fP=| g | 


13. Calculate M’ using a diagonalization process. Eigenval- 
ues of the matrix are given. 


-7 -10 
()M=| 5 f\-23 
aT ges 
(d) u-| ey | 4,8 [A]-359 
20 20 


14. Justify the claim. 


(a) Similar matrices have the same trace (sum of the 
entries on the main diagonal). 


(b) Similar matrices have equal eigenspace dimen- 
sions. 


Answers 


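reflection about $\ell$ The standard matrix for reflection about the x-axis followed by counterclockwise rotation by $2\alpha$ about the origin is

$$\begin{bmatrix} \cos(2\alpha) & -\sin(2\alpha) \\ \sin(2\alpha) & \cos(2\alpha) \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \cos(2\alpha) & \sin(2\alpha) \\ \sin(2\alpha) & -\cos(2\alpha) \end{bmatrix},$$

which agrees with (5.4.2).

inverse of P It is easiest to verify that the product of the two matrices is the identity:

$$\begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} = \begin{bmatrix} \cos^2\alpha + \sin^2\alpha & \sin\alpha\cos\alpha - \cos\alpha\sin\alpha \\ \cos\alpha\sin\alpha - \sin\alpha\cos\alpha & \cos^2\alpha + \sin^2\alpha \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$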


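eigenvalues and eigenvectors The characteristic equation is

$$\begin{aligned} \begin{vmatrix} -7-\lambda & -4 \\ 6 & 7-\lambda \end{vmatrix} &= (-7-\lambda)(7-\lambda) - (-4)(6) \\ &= -49 + \lambda^2 + 24 \\ &= \lambda^2 - 25 = 0 \end{aligned}$$

so the eigenvalues are −5 and 5. Eigenvectors are found by reducing the matrices

$$\begin{bmatrix} -2 & -4 \\ 6 & 12 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} -12 & -4 \\ 6 & 2 \end{bmatrix}$$

to

$$\begin{bmatrix} -2 & -4 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} -12 & -4 \\ 0 & 0 \end{bmatrix},$$

which gives eigenvectors of the forms $x = -2y$ for $\lambda = -5$ and $y = -3x$ for $\lambda = 5$. Choosing eigenvectors with small integer entries, we take eigenpairs

$$\left(-5, \begin{bmatrix} -2 \\ 1 \end{bmatrix}\right) \quad\text{and}\quad \left(5, \begin{bmatrix} 1 \\ -3 \end{bmatrix}\right).$$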


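coordinates with respect to $\mathcal{C}$ The change of coordinates matrix from the standard basis to $\mathcal{C}$ is

$$\begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix}^{-1} = \frac{1}{5}\begin{bmatrix} -3 & -1 \\ -1 & -2 \end{bmatrix},$$

so point $P$ has coordinates

$$\frac{1}{5}\begin{bmatrix} -3 & -1 \\ -1 & -2 \end{bmatrix} \begin{bmatrix} -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 3/5 \\ -4/5 \end{bmatrix}_{\mathcal{C}}.$$

coordinates with respect to $\mathcal{C}$ As can be seen from figure 5.4.2, the coordinates of $S(P)$ are $\begin{bmatrix} 2 \\ 9 \end{bmatrix}$, consistent with (5.4.3), which shows $S\left(\begin{bmatrix} -2 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 9 \end{bmatrix}$.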
Part II 


Applications 
Mathematical Applications 


6.1 LU Factorization [3.4, 3.6] 


Humans and computers bring very different skills to a task. Computers, as the name would suggest, are famously adept at computing. Handling fractions, square roots, irrational constants, and 1 + 2 are all the same to a computer. Humans tend to balk at non-integer computations and lengthy ones too. Ask a human to execute five, six, or twenty numerical operations, even integer operations, to solve a single problem and they might think you were simply asking too much. Ask a computer to do the same and you’ll have your answer in just a few milliseconds.

Computers supply speed and accuracy to what would be tedious and error prone for a human. Computers make the otherwise impractical practical. Linear systems with a hundred, thousand, or even a hundred thousand equations are easily within the practical limits of computers. Even a home computer can handle a few hundred equations. However, at that size, efficiency plays a major role in practicality. An efficient algorithm may be able to handle a certain computation in a few seconds while an inefficient one may take a few days for the same. Humans provide appropriate and fast algorithms that would, at least until AI makes some huge advances, be impossible for a computer.

What makes a good algorithm is first and foremost accuracy. If the algorithm does not conclude with the correct answer, it is of no use at all. What makes one correct algorithm better than another is speed, measured by comparing the number of computations needed to complete the computation with common functions such as $n^2$ or $3^n$ where $n$ is some measure of the “size” of the problem. Such a function gives a good idea how the time it takes to solve a problem grows as the size grows. To get a sense of the computer’s speed in solving a linear system, let $n$ represent the number of equations to be solved. The row reduction algorithm of section 2.2, also known as Gaussian elimination, requires $\frac{n^3}{3} + n^2 - \frac{n}{3}$ multiplications/divisions and $\frac{n^3}{3} + \frac{n^2}{2} - \frac{5n}{6}$ additions/subtractions ([4] section 6.1) to execute. Table 6.1 lists the number of arithmetic operations required to compute reduced row echelon form for a general linear system with $n$ equations in $n$ unknowns. For a human with a handheld calculator, systems with 2 or 3 variables are doable ($n = 2$ or $n = 3$). Systems with 4 variables would be tedious in general, but practical with a few strategically placed zeros. However, somewhere between 5 and 10 equations we find the limit of human practicality for general systems.


Table 6.1: Arithmetic operations required for row reduction

equations, n |             ×/÷ |             +/− |           total |        (2/3)n^3
           2 |               6 |               3 |               9 |           5 1/3
           3 |              17 |              11 |              28 |              18
           4 |              36 |              26 |              62 |          42 2/3
           9 |             321 |             276 |             597 |             486
          51 |          46,801 |          45,475 |          92,276 |          88,434
         102 |         364,106 |         358,853 |         722,959 |         707,472
         501 |      42,168,001 |      42,042,250 |      84,210,251 |      83,834,334
       2,001 |   2,674,672,001 |   2,672,669,000 |   5,347,341,001 |   5,341,337,334
      10,002 | 333,633,410,006 | 333,583,385,003 | 667,216,795,009 | 667,066,746,672
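The entries of table 6.1 can be regenerated from the two operation-count formulas, $\frac{n^3}{3} + n^2 - \frac{n}{3}$ multiplications/divisions and $\frac{n^3}{3} + \frac{n^2}{2} - \frac{5n}{6}$ additions/subtractions. The sketch below assumes those formulas (they reproduce every row of the table) and uses a hypothetical function name; exact fractions avoid any rounding.

```python
from fractions import Fraction

def gaussian_elimination_counts(n):
    """Operation counts for row reducing n equations in n unknowns.

    Returns (multiplications/divisions, additions/subtractions, total,
    the approximation (2/3) n^3), using the count formulas quoted in the text.
    """
    mult_div = Fraction(n**3, 3) + n**2 - Fraction(n, 3)
    add_sub = Fraction(n**3, 3) + Fraction(n**2, 2) - Fraction(5 * n, 6)
    return int(mult_div), int(add_sub), int(mult_div + add_sub), Fraction(2 * n**3, 3)

for n in [2, 3, 4, 9, 51, 102, 501, 2001, 10002]:
    mult_div, add_sub, total, approx = gaussian_elimination_counts(n)
    print(n, mult_div, add_sub, total, float(approx))
```

The last column shows how quickly the leading term $\frac{2}{3}n^3$ comes to dominate the total.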
You might well ask whether systems of 500 equations in 500 unknowns are practical even for a computer. Their solution by Gaussian elimination requires over 83 million computations! Even if the computer does one computation every microsecond (millionth of a second), solving a system of 500 equations in 500 unknowns will take about 83 seconds. Depending on how quickly results are needed, this may or may not be practical. The same computer would take over 5,300 seconds (about an hour and a half) to solve a system of 2,000 equations in 2,000 unknowns and over 666,000 seconds (over 7 and a half days!) to solve a system of 10,000 equations in 10,000 unknowns. Of course faster computers or clusters of computers could be put to the task to speed up the computation, but no matter how much computing power is supplied, there will always be a less-than-astronomical size outside its practical limits.

A better option than more computing power is to streamline the algorithm. The numerical “speed” of Gaussian elimination is approximately proportional to $n^3$. The rightmost column of table 6.1 illustrates this point. The number of computations can be well approximated by $\frac{2}{3}n^3$ for large $n$. In the parlance of numerical analysis, one would say the algorithm executes in $O(n^3)$ time, read “big-oh of $n^3$ time”. The implication is that doubling the size of the problem multiplies the time it takes to execute by 8, and more generally increasing the size by a factor of $k$ increases its execution time by a factor of $k^3$.

If the algorithm executed in, say, $O(n^2)$ time it would reduce the number of computations needed to solve large problems by several orders of magnitude. For example, an algorithm that required approximately $4n^2$ computations would need “only” about 1,000,000 computations to solve a system of 500 equations in 500 unknowns (compare this to the 83 million for Gaussian elimination); about 16,000,000 for a system of 2,000 equations in 2,000 unknowns (compare this to the over 5.3 billion for Gaussian elimination); and about 400 million for a system of 10,000 equations in 10,000 unknowns (reducing the estimated time of 7.5 days to about 6 minutes!).

Now imagine you have to solve the system multiple times for multiple sets of constants, but the same coefficient matrix, something that happens frequently in industrial applications. 7.5 days is preferable to 75 days for solving the system with ten different sets of constants. This is the approximate effect and purpose of using LU factorization: solving large systems for multiple sets of constants. The factorization itself takes about the same effort as Gaussian elimination, but subsequent solutions take $O(n^2)$ time.

The efficiency gain is due to turning the general problem into a special case that is much quicker to solve. As 
noted earlier, asking a human with a handheld calculator to solve a system of 4 equations in 4 unknowns borders on 
the impractical as it requires as many as 62 individual arithmetic operations. But asking a human to solve a system 
of 4 equations in 4 unknowns where the coefficient matrix is upper triangular is well within reason. You might have 
a go at solving the system 

-14 0 -10 12 w 
0 9 -8 4 x |_| -2 
0 O -1I5 -6 y | | 6 
0 0 O 10 Zz 15 


to see that it takes 26 arithmetic operations, still not terribly appealing but much better than the 62 for Gaussian elimination of a general 4-equation, 4-variable system. A similar reduction is achieved when the coefficient matrix is lower triangular. These observations are at the heart of the LU factorization (lower-upper factorization) algorithm. It requires many fewer computations to solve two linear systems, one with an upper triangular coefficient matrix and the other with a lower triangular coefficient matrix, than it does to solve a single general linear system. In general, solving a system with either an upper triangular or lower triangular coefficient matrix can be done in $O(n^2)$ time. The number of computations needed is approximately proportional to $n^2$.

To review, if we could factor a general coefficient matrix $M$ into the product of a lower triangular and upper triangular matrix, we could achieve such efficiency. Solving $M\mathbf{v} = \mathbf{b}$ directly by Gaussian elimination requires $O(n^3)$ operations while solving $LU\mathbf{v} = \mathbf{b}$ “twice” (first to find $U\mathbf{v}$ by solving $L\mathbf{w} = \mathbf{b}$ and second to find $\mathbf{v}$ by solving $U\mathbf{v} = \mathbf{w}$) requires only $O(n^2)$ operations.
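The two triangular solves are short enough to write out in full. The following is a minimal sketch, not the book's implementation: the function names and the small 3 × 3 factors `L` and `U` are made up for illustration, and exact fractions are used in place of floating point. Each solve is a double loop, which is where the $O(n^2)$ operation count comes from.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L w = b for lower triangular L with nonzero diagonal."""
    n = len(b)
    w = [Fraction(0)] * n
    for i in range(n):
        s = sum(L[i][j] * w[j] for j in range(i))
        w[i] = (Fraction(b[i]) - s) / L[i][i]
    return w

def back_substitution(U, w):
    """Solve U v = w for upper triangular U with nonzero diagonal."""
    n = len(w)
    v = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (Fraction(w[i]) - s) / U[i][i]
    return v

# Hypothetical factors of some coefficient matrix M = LU.
L = [[1, 0, 0], [2, 1, 0], [-1, 3, 1]]
U = [[2, 1, -1], [0, 3, 2], [0, 0, 4]]
b = [1, 5, 12]

w = forward_substitution(L, b)   # first solve L w = b
v = back_substitution(U, w)      # then solve U v = w
print([str(x) for x in w], [str(x) for x in v])
```

Once `L` and `U` are known, a new right-hand side `b` costs only these two passes rather than a fresh elimination.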

The process for factoring $M$ into $LU$ essentially amounts to saving your progress in executing Gaussian elimination to the point where $M$ has first reached echelon form. As we did several times in sections 3.6 and 3.7, we will rely on the fact that row reduction can be expressed by multiplication by (invertible) elementary matrices. If we record the row operations (elementary matrices) that reduce $M$ to echelon form, we have

$$E_p \cdots E_2E_1M = U$$

where $U$ (the echelon form of $M$) is upper triangular. Assuming no row swaps have been done, the product $E_p \cdots E_2E_1$ is lower triangular since all row replacements are done by adding a multiple of one row to a row below it (and row scaling does not affect the locations of zeros). By the same reasoning the inverse of $E_p \cdots E_2E_1$ is lower triangular, so setting $L^{-1} = E_p \cdots E_2E_1$ we have

$$M = (E_p \cdots E_2E_1)^{-1}U = LU.$$

It is not realistic to expect that reduction can be done without row swaps, however. If, for example, the 1, 1-entry of $M$ is zero and there is at least one nonzero entry below it, there is no choice. The first operation of row reduction requires a row swap. This means $M$ cannot always be factored into a lower triangular times upper triangular product, and allowing row swaps is necessary. In the end, a factorization that includes row swaps does not factor $M$ into a product $LU$. Instead, it factors $M$ into a product $PLU$ where $P$ is a permutation matrix, a matrix that holds the same rows as the identity matrix but in a possibly different order. Including $P$ in the computation this way does not add any arithmetic operations to the algorithm. It adds only swapping of values, a computer operation that is so fast as to be insignificant compared to arithmetic operations. Allowing row swaps in an LU decomposition is known as LU decomposition with partial pivoting.


An Example 


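To illustrate the method, we will factor (factorize, or decompose)

$$M = \begin{bmatrix} 18 & -35 & -4 & -56 \\ -14 & 21 & 4 & 42 \\ 6 & -7 & -1 & -23 \\ 6 & -9 & -2 & -18 \end{bmatrix}.$$

operations: $M_{1,:} \leftrightarrow M_{4,:}$
result:
$$\begin{bmatrix} 6 & -9 & -2 & -18 \\ -14 & 21 & 4 & 42 \\ 6 & -7 & -1 & -23 \\ 18 & -35 & -4 & -56 \end{bmatrix}$$

operations: $M_{1,:} \to \frac{1}{3}M_{1,:}$
result:
$$\begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ -14 & 21 & 4 & 42 \\ 6 & -7 & -1 & -23 \\ 18 & -35 & -4 & -56 \end{bmatrix}$$

operations: $M_{2,:} \to 7M_{1,:} + M_{2,:}$, $M_{3,:} \to -3M_{1,:} + M_{3,:}$, $M_{4,:} \to -9M_{1,:} + M_{4,:}$
result:
$$\begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ 0 & 0 & -\frac{2}{3} & 0 \\ 0 & 2 & 1 & -5 \\ 0 & -8 & 2 & -2 \end{bmatrix}$$

operations: $M_{2,:} \leftrightarrow M_{3,:}$
result:
$$\begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\frac{2}{3} & 0 \\ 0 & -8 & 2 & -2 \end{bmatrix}$$

operations: $M_{4,:} \to 4M_{2,:} + M_{4,:}$
result:
$$\begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\frac{2}{3} & 0 \\ 0 & 0 & 6 & -22 \end{bmatrix}$$

operations: $M_{4,:} \to 9M_{3,:} + M_{4,:}$
result:
$$\begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\frac{2}{3} & 0 \\ 0 & 0 & 0 & -22 \end{bmatrix}$$

We have reached echelon form, so we have identified $U$ (the echelon form itself):

$$U = \begin{bmatrix} 2 & -3 & -\frac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\frac{2}{3} & 0 \\ 0 & 0 & 0 & -22 \end{bmatrix}$$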
0 0 -22 
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and the product of the 8 elementary matrices corresponding to the 8 row operations form PL. Apply the inverse of 
each row operation, in reverse order, starting with the identity matrix: 


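\[
\begin{array}{ll}
\text{operation} & \text{inverse} \\[2pt]
M_{4,:} \to 9M_{3,:} + M_{4,:} & M_{4,:} \to -9M_{3,:} + M_{4,:} \\
M_{4,:} \to 4M_{2,:} + M_{4,:} & M_{4,:} \to -4M_{2,:} + M_{4,:} \\
M_{2,:} \leftrightarrow M_{3,:} & M_{2,:} \leftrightarrow M_{3,:} \\
M_{4,:} \to -9M_{1,:} + M_{4,:} & M_{4,:} \to 9M_{1,:} + M_{4,:} \\
M_{3,:} \to -3M_{1,:} + M_{3,:} & M_{3,:} \to 3M_{1,:} + M_{3,:} \\
M_{2,:} \to 7M_{1,:} + M_{2,:} & M_{2,:} \to -7M_{1,:} + M_{2,:} \\
M_{1,:} \to \tfrac{1}{3}M_{1,:} & M_{1,:} \to 3M_{1,:} \\
M_{1,:} \leftrightarrow M_{4,:} & M_{1,:} \leftrightarrow M_{4,:}
\end{array}
\]

In full detail:

\[
\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}
\to \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&-9&1 \end{bmatrix}
\to \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&-4&-9&1 \end{bmatrix}
\to \begin{bmatrix} 1&0&0&0 \\ 0&0&1&0 \\ 0&1&0&0 \\ 0&-4&-9&1 \end{bmatrix}
\to \begin{bmatrix} 1&0&0&0 \\ 0&0&1&0 \\ 0&1&0&0 \\ 9&-4&-9&1 \end{bmatrix}
\]
\[
\to \begin{bmatrix} 1&0&0&0 \\ 0&0&1&0 \\ 3&1&0&0 \\ 9&-4&-9&1 \end{bmatrix}
\to \begin{bmatrix} 1&0&0&0 \\ -7&0&1&0 \\ 3&1&0&0 \\ 9&-4&-9&1 \end{bmatrix}
\to \begin{bmatrix} 3&0&0&0 \\ -7&0&1&0 \\ 3&1&0&0 \\ 9&-4&-9&1 \end{bmatrix}
\to \begin{bmatrix} 9&-4&-9&1 \\ -7&0&1&0 \\ 3&1&0&0 \\ 3&0&0&0 \end{bmatrix}.
\]

At this point, we have

\[
M = \begin{bmatrix} 9&-4&-9&1 \\ -7&0&1&0 \\ 3&1&0&0 \\ 3&0&0&0 \end{bmatrix}
\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\tfrac{2}{3} & 0 \\ 0 & 0 & 0 & -22 \end{bmatrix},
\]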


which, as expected, is not lower triangular times upper triangular. The final step is to permute the rows of M. The permutation matrix $P^{-1}$ can be constructed by applying only the row swaps to the identity matrix, in the same order in which they were applied during row reduction. In this case, that means $M_{1,:} \leftrightarrow M_{4,:}$ and then $M_{2,:} \leftrightarrow M_{3,:}$:

\[
\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}
\to \begin{bmatrix} 0&0&0&1 \\ 0&1&0&0 \\ 0&0&1&0 \\ 1&0&0&0 \end{bmatrix}
\to \begin{bmatrix} 0&0&0&1 \\ 0&0&1&0 \\ 0&1&0&0 \\ 1&0&0&0 \end{bmatrix} = P^{-1}.
\]

In this case the order of the swaps does not matter, but when an index is repeated within the set of swaps, the order will matter. Finally, we have

\[
P^{-1}M = \begin{bmatrix} 3&0&0&0 \\ 3&1&0&0 \\ -7&0&1&0 \\ 9&-4&-9&1 \end{bmatrix}
\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \\ 0 & 2 & 1 & -5 \\ 0 & 0 & -\tfrac{2}{3} & 0 \\ 0 & 0 & 0 & -22 \end{bmatrix} = LU,
\]

or M = PLU. Can you verify this? Answer on page 200.

Therefore the system Mv = b is equivalent to PLUv = b. Solving amounts to first applying $P^{-1}$ to b, permuting its entries, yielding $LUv = P^{-1}b$; second, solving $Lw = P^{-1}b$, which yields $w = L^{-1}P^{-1}b$; and third, solving Uv = w, which yields v such that $Uv = L^{-1}P^{-1}b$.

When M is invertible, U will be invertible, and we will have $v = U^{-1}L^{-1}P^{-1}b = M^{-1}b$, but the method works even when M is noninvertible. Whenever Uv = w is consistent, its solutions will also be solutions of Mv = b (and whenever Uv = w is inconsistent, Mv = b will also be inconsistent).
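
The three steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`forward_sub`, `back_sub`, `solve_plu`), the list `perm` encoding $P^{-1}$, and the right-hand side `b` are all made up for the example, the matrices are the L and U of the worked example, and the substitution routines assume nonzero diagonal entries. Exact fractions are used so the check at the end is not blurred by rounding.

```python
from fractions import Fraction as F

# Factors from the worked example, where P^{-1} M = L U.
L = [[3, 0, 0, 0],
     [3, 1, 0, 0],
     [-7, 0, 1, 0],
     [9, -4, -9, 1]]
U = [[2, -3, F(-2, 3), -6],
     [0, 2, 1, -5],
     [0, 0, F(-2, 3), 0],
     [0, 0, 0, -22]]
# P^{-1} reverses the order of the entries: entry i of P^{-1} b is b[perm[i]].
perm = [3, 2, 1, 0]

def forward_sub(L, c):
    """Solve L w = c for lower triangular L (assumes nonzero diagonal)."""
    w = []
    for i, row in enumerate(L):
        s = sum(row[j] * w[j] for j in range(i))
        w.append((F(c[i]) - s) / row[i])
    return w

def back_sub(U, w):
    """Solve U v = w for upper triangular U (assumes nonzero diagonal)."""
    n = len(U)
    v = [F(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (w[i] - s) / U[i][i]
    return v

def solve_plu(L, U, perm, b):
    c = [b[p] for p in perm]   # step 1: apply P^{-1} to b
    w = forward_sub(L, c)      # step 2: solve L w = P^{-1} b
    return back_sub(U, w)      # step 3: solve U v = w

M = [[18, -35, -4, -56],
     [-14, 21, 4, 42],
     [6, -7, -1, -23],
     [6, -9, -2, -18]]
b = [1, 2, 3, 4]               # hypothetical constants, chosen only for illustration
v = solve_plu(L, U, perm, b)
# Each entry of M v - b should be exactly zero.
residual = [sum(M[i][j] * v[j] for j in range(4)) - b[i] for i in range(4)]
print(v, residual)
```

Once L, U, and the permutation are in hand, a new right-hand side costs only the two substitution loops, which is the source of the savings mentioned in this section.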
Key Concepts
factorization factoring a matrix into a product of matrices.
decomposition factorization.


LU factorization factoring a matrix into a product of a lower triangular matrix (L) by an upper triangular matrix (U).


partial pivoting allowing row swaps in LU factorization. In this case, the algorithm factorizes PM into LU for some permutation matrix P.


permutation matrix a matrix containing the same rows as an identity matrix but in a possibly different order. Such a matrix will have exactly one 1 in each row and each column and zeros elsewhere.


LU advantage solving a system in $O(n^3)$ time, subsequently solving systems with the same coefficient matrix but different constants in $O(n^2)$ time.


Exercises =) 3 
(c) M=| 6 -10 | [S]-326 
1. Provide an LU factorization of M (no row swaps during 8-5 
row reduction). 

4. -7 
_| 3 > (4d) M=| 8 -2 
(a) w=| -9 -14 [A]-359 6 12 

5 -6 8 -10 -4 

ee -10 14 | (ec) M=] -1 2 0 | [A]-359 
-12 12 1 


—2 -6 
5 35 | [A]-359 


() M=| -7 3 -4 


10 1 -10 | 


21 =f | 5 -1 7 


E 
: 3. Use the LU decomposition to calculate the determinant 
| : a 7 | [S]-325 of M in question 1. [S]-326 [A]-359 
4. Use the LU decomposition to calculate the determinant 
_| -18 -36 -12 of M in question 2. [S]-326 [A]-359 
Chih= 12 29 -12 | 


5. LU factorizations are not unique. Redo question | with- 


-25 10 out using row swaps or row scaling, thus producing a fac- 

(g) M=| -20 —-13 | [A]-359 torization where L has 1s on its diagonal. [S]-326 [A]- 

5 -8 359) 

-42 ~-7 6. LU factorizations are not unique. Redo question 1 (with- 

(h) M=| 24 16 out using row swaps) but use row scaling so that all the 
24 26 nonzero diagonal entries of U are 1. [A]-359 
6 8 4 7. Use the fact that M = LU to help solve the system 

@ M=| 6 4 = -4 | [A}-359 Mv =b. 
9 _ 0 = 

s @ M=| % ee je=| a 

21 -7 -35 

(i) M=| 3 37, «Oo u=| 6 7 fe=| “6 | 
= a4) 210 ve A "e 

: : : ; -5 1 1 O 

2. Find a permutation matrix P # J and an LU factoriza- (b) M= 2) 2 ay oe 4 1 : 
tion of PM. Use partial pivoting (at least one row swap 5 1 73 
duri duction). =| =| 3 
uring row reduction) U | 0 6 fiw | 14 | [A]-360 
(a) u=| 7 = | [A]-359 (c) M= 4 7 a 1 0 
“| 4 27?" "7. 17 

-10 -9 4 7 -2 

@ u=[P 2] ufo 4s be] 2 
2
-4 


0 m-| 
4 


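8. Redo question 7 solving Mv = b directly (without using L or U) instead. Compare the labor involved in the two methods.

9. Find the inverse of the matrix M of question 7 by computing $U^{-1}L^{-1}$. [A]-360

10. Suppose M is an n × n matrix with decomposition LU and L has 1s on its diagonal. Count precisely the number of arithmetic operations (additions, subtractions, multiplications, and divisions) needed to solve the system Mv = b using the LU decomposition. It will be a quadratic function of n.

11. Rank factorization. Suppose M is an m × n matrix of rank k < min{m, n}. Argue that there are matrices P, C, F where PM = CF; P is a permutation matrix; C is m × k; and F is k × n.


Answers


LU-factorization verified

\[
P^{-1}M = \begin{bmatrix} 6 & -9 & -2 & -18 \\ 6 & -7 & -1 & -23 \\ -14 & 21 & 4 & 42 \\ 18 & -35 & -4 & -56 \end{bmatrix}
\]

and

\begin{align*}
(LU)_{1,:} &= 3\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \end{bmatrix} = \begin{bmatrix} 6 & -9 & -2 & -18 \end{bmatrix} \\
(LU)_{2,:} &= 3\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \end{bmatrix} + 1\begin{bmatrix} 0 & 2 & 1 & -5 \end{bmatrix} = \begin{bmatrix} 6 & -7 & -1 & -23 \end{bmatrix} \\
(LU)_{3,:} &= -7\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \end{bmatrix} + 1\begin{bmatrix} 0 & 0 & -\tfrac{2}{3} & 0 \end{bmatrix} = \begin{bmatrix} -14 & 21 & 4 & 42 \end{bmatrix} \\
(LU)_{4,:} &= 9\begin{bmatrix} 2 & -3 & -\tfrac{2}{3} & -6 \end{bmatrix} - 4\begin{bmatrix} 0 & 2 & 1 & -5 \end{bmatrix} - 9\begin{bmatrix} 0 & 0 & -\tfrac{2}{3} & 0 \end{bmatrix} + 1\begin{bmatrix} 0 & 0 & 0 & -22 \end{bmatrix} \\
&= \begin{bmatrix} 18 & -35 & -4 & -56 \end{bmatrix}
\end{align*}

so $P^{-1}M = LU$.


6.2 The Power Method [3.5]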


Students typically learn about the quadratic formula in high school or earlier. It provides a formula for the solutions of the equation

\[ ax^2 + bx + c = 0 \]

(the roots of $f(x) = ax^2 + bx + c$) for any values a, b, and c. You may remember it as

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

The formula can also be written as

\[ x = \frac{-b \pm (b^2 - 4ac)^{1/2}}{2a}, \]

which combines the coefficients of the equation using addition, subtraction, multiplication, division, and rational exponentiation.

What you may never have seen are the more elaborate formulas for solving the general cubic,

\[ a_3x^3 + a_2x^2 + a_1x + a_0 = 0, \]

and the general quartic,

\[ a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0 = 0. \]

Crumpet 25: Cubic and Quartic Formulas 


The formula for solving the general cubic polynomial is most often credited to Gerolamo Cardano as he is the first 
known to have published the result, in 1545. However, history reveals that Niccolo Tartaglia knew of the solution 
before Cardano, and Scipione del Ferro, who died circa 1526, before him. In the same work where Cardano published 
the solution of the cubic, Ars Magna, he also published the solution of the quartic. History has been kinder to the 
original solver of the quartic, however, most often crediting the result to Lodovico Ferrari, a student of Cardano. 


Though the formulas are much more involved than the quadratic formula, each one uses nothing more than 
addition, subtraction, multiplication, division, and radical exponentiation. As a result, any solution of any polynomial 
equation up to degree four can be written down explicitly and exactly. 

A famous result of Galois theory is that there are unsolvable fifth degree polynomials (quintics), and unsolvable polynomials of all higher degrees as well. Their roots have no closed form expression using addition, subtraction, multiplication, division, and rational exponentiation of their coefficients. Their values cannot be written down exactly using traditional operations. The modest-looking $f(x) = x^5 - 10x + 2$ is a classic example. Its roots cannot be written down exactly and explicitly. However, being a fifth degree polynomial with real coefficients, it must have at least one real root. For large negative values of x, say −100 or −1000, f(x) is negative and for large positive values of x, say 100 or 1000, f(x) is positive. Somewhere in between, f(x) must be zero (as required by the intermediate value theorem). Despite the impossibility of a formula, SageMath and other computer algebra systems can still find roots of any polynomial. But how can the roots of a fifth degree polynomial be found if there is no formula?

As discussed, there is no exact, explicit formula. There are, however, many methods for approximating such roots, each one capable of determining the roots to arbitrary precision. Getting back to finding the roots of $f(x) = x^5 - 10x + 2$, if we set $x_0 = 2$ and let $x_{i+1} = x_i - \frac{f(x_i)}{5x_i^4 - 10}$, we can generate a sequence of approximations that get more and more accurate as we proceed:

\[ x_1 = x_0 - \frac{f(x_0)}{5x_0^4 - 10} = 2 - \frac{f(2)}{5(2)^4 - 10} = \frac{9}{5} \quad\text{and}\quad x_2 = x_1 - \frac{f(x_1)}{5x_1^4 - 10} = \frac{9}{5} - \frac{f(9/5)}{5(9/5)^4 - 10} \approx 1.731847. \]

Rounded to six decimal places each, the sequence $x_0, x_1, \ldots$ begins

2, 1.8, 1.731847, 1.724390, 1.724306

and 1.724306 is a root of f accurate to six decimal places.¹


This is an example of an iterative routine. A starting value is given (2 in this case) and a recursive formula ($x_{i+1} = x_i - \frac{f(x_i)}{5x_i^4 - 10}$) is applied to generate the rest of the list. The output of one iteration is input into the formula to calculate the next. As more and more iterations are calculated, the output approaches the desired quantity. There are a number of iterative routines for linear algebra problems too.
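
For readers who would like to watch the iteration run, here is a minimal Python sketch of the routine just described. The function names (`f`, `fprime`, `newton`) are invented for this illustration; the starting value 2 and the polynomial are the ones from the conversation above.

```python
def f(x):
    # The quintic from the text: f(x) = x^5 - 10x + 2.
    return x**5 - 10*x + 2

def fprime(x):
    # Its derivative, 5x^4 - 10, the denominator in the recursive formula.
    return 5*x**4 - 10

def newton(x0, steps):
    """Return [x0, x1, ..., x_steps] where x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

xs = newton(2.0, 4)
for i, x in enumerate(xs):
    print(i, round(x, 6))
```

Four steps are enough to land near the value 1.724306 quoted above; asking for more steps changes only digits beyond the sixth decimal place.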


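Let

\[ M = \begin{bmatrix} 7 & 8 \\ -4 & -5 \end{bmatrix}, \]

$v_0 = [\,-1 \;\; 0\,]^T$ and $v_{i+1} = Mv_i$. With the initial value $v_0$ and recursive formula, $v_{i+1} = Mv_i$, we can compute a sequence of vectors:

\begin{align*}
v_1 &= Mv_0 = [\,-7 \;\; 4\,]^T \\
v_2 &= Mv_1 = [\,-17 \;\; 8\,]^T \\
v_3 &= Mv_2 = [\,-55 \;\; 28\,]^T \\
v_4 &= Mv_3 = [\,-161 \;\; 80\,]^T \\
v_5 &= Mv_4 = [\,-487 \;\; 244\,]^T \\
v_6 &= Mv_5 = [\,-1457 \;\; 728\,]^T \\
v_7 &= Mv_6 = [\,-4375 \;\; 2188\,]^T \\
v_8 &= Mv_7 = [\,-13121 \;\; 6560\,]^T \\
v_9 &= Mv_8 = [\,-39367 \;\; 19684\,]^T \\
v_{10} &= Mv_9 = [\,-118097 \;\; 59048\,]^T
\end{align*}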


Though it is likely not apparent, something remarkable is happening here. Each vector is closer than the last to a vector of interest. Plotting the vectors helps reveal the phenomenon. Figure 6.2.1 shows eleven lines, one in the direction of each $v_i$. The head of each $v_i$ is marked with a point though it is difficult to see since $v_0, v_1, \ldots, v_6$ are nearly on top of one another. Nonetheless, the figure illustrates what is happening. The slopes of the lines are converging (approaching a particular value). The last three lines, $\{rv_8 : r \in \mathbb{R}\}$, $\{rv_9 : r \in \mathbb{R}\}$, and $\{rv_{10} : r \in \mathbb{R}\}$ all seem to lie more or less on the line $y = -\frac{1}{2}x$. That is, they essentially lie in the direction of $[\,-2 \;\; 1\,]^T$. Further iteration will reveal more of the same.


What if $v_0$ were different, though? Can you calculate a similar sequence of vectors starting with a different $v_0$? Do the vectors of your sequence approach the direction $[\,-2 \;\; 1\,]^T$ too? Answers with $v_0 = [\,1 \;\; 1\,]^T$ on page 208.


¹ You may recognize this as Newton's method.

Figure 6.2.1: Calculating $v_{i+1} = Mv_i$ shows a sort of convergence

Figure 6.2.1 nicely illustrates the convergence geometrically, but we ought to be able to detect it algebraically as well. Let $\tilde{v}_i = v_i/(v_i)_{2,1}$, $i = 1, 2, \ldots, 10$. That is, let $\tilde{v}_i$ be $v_i$ scaled by the reciprocal of its second entry. Then

\begin{align*}
\tilde{v}_1 &= v_1/(v_1)_{2,1} = [\,-1.75 \;\; 1\,]^T \\
\tilde{v}_2 &= v_2/(v_2)_{2,1} = [\,-2.125 \;\; 1\,]^T \\
\tilde{v}_3 &= v_3/(v_3)_{2,1} \approx [\,-1.96429 \;\; 1\,]^T \\
\tilde{v}_4 &= v_4/(v_4)_{2,1} = [\,-2.0125 \;\; 1\,]^T \\
\tilde{v}_5 &= v_5/(v_5)_{2,1} \approx [\,-1.99590 \;\; 1\,]^T \\
\tilde{v}_6 &= v_6/(v_6)_{2,1} \approx [\,-2.00137 \;\; 1\,]^T \\
\tilde{v}_7 &= v_7/(v_7)_{2,1} \approx [\,-1.99954 \;\; 1\,]^T \\
\tilde{v}_8 &= v_8/(v_8)_{2,1} \approx [\,-2.00015 \;\; 1\,]^T \\
\tilde{v}_9 &= v_9/(v_9)_{2,1} \approx [\,-1.99995 \;\; 1\,]^T \\
\tilde{v}_{10} &= v_{10}/(v_{10})_{2,1} \approx [\,-2.00002 \;\; 1\,]^T
\end{align*}


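Being a scalar multiple, the vector $\tilde{v}_i$ points in the same direction as $v_i$. The list of $\tilde{v}_i$ numerically demonstrates that the $\tilde{v}_i$, and therefore the $v_i$, are pointing closer and closer to the $[\,-2 \;\; 1\,]^T$ direction. Interestingly,

\[ M\begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 & 8 \\ -4 & -5 \end{bmatrix}\begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} -6 \\ 3 \end{bmatrix} = 3\begin{bmatrix} -2 \\ 1 \end{bmatrix} \]

or $M[\,-2 \;\; 1\,]^T = 3[\,-2 \;\; 1\,]^T$, so 3, $[\,-2 \;\; 1\,]^T$ is an eigenpair of M!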


Sit back and think about this for a moment. We started with a seemingly arbitrary matrix and a pretty shabby 
approximation of one of its eigenvectors. We then proceeded to multiply the (shabby) approximation by M, the 


resulting product by M again, the resulting product by M again, and so on, the final result of which was a very good
approximation of an eigenvector of M. With a computer at hand to do the calculation, this is a whole lot easier than 
solving a characteristic equation. More importantly, though, remember general polynomials of degree five or higher 
cannot be solved exactly. This means eigenpairs for square matrices of size 5 x 5 and up cannot generally be found 
exactly (their characteristic polynomials are fifth degree or higher). Numerical methods must be used! 

The approach of iteratively multiplying some vector by the matrix whose eigenpairs are desired, known as the 
power method, seems to have potential, but the example should leave you with lots of questions. For example, 


Does this always work? 

Why does it work? 

If it doesn’t always work, when does it work? 

Are there other methods we can try? 

Can we say how many iterations are needed to get a good approximation? 
Which eigenpair is found when it works? 

Does it matter what vector is chosen for $v_0$?

Will the method always produce the same eigenvector? 

What about other eigenvectors? 


The computation in crumpet 26 answers several questions, partially answers others, and leaves some open. For example, the method works when


(i) M is n × n and has n linearly independent eigenvectors, and

(ii) one of the eigenvalues is dominant in the sense that its magnitude is larger than all the others, and

(iii) the eigenspace of the dominant eigenvalue has dimension one, and

(iv) $v_0$ is chosen appropriately.


That's not to say it won't work in other instances. As mathematicians say, these are sufficient conditions, not necessary conditions. Other answers include


1. The method only determines the eigenvector corresponding to the dominant eigenvalue.

2. The accuracy of the approximation is proportional to the largest ratio $\left|\frac{\lambda_i}{\lambda_d}\right|^k$, $i \neq d$, where $\lambda_d$ is the dominant eigenvalue.


The exercises explore a number of these points, and describe a modification of the power method that can be used to find any eigenpair.

On a practical note, implementation of the method will include scaling $v_k$ with each iteration. As the example shows, the norm of $v_k$ can grow very large very quickly. Crumpet 1 reveals that $v_k$ will grow (or decay) exponentially. Since it is only the direction of $v_k$ that matters, scaling does not affect the success of the algorithm. Typically, $v_k$ will be divided by

\[ \max\{|(v_k)_{j,1}| : j = 1, 2, \ldots, n\} \tag{6.2.1} \]

so that the magnitude of the greatest (or least) entry of $v_k$ is one.
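
A minimal Python sketch of the power method with the scaling of (6.2.1) might look like the following. The function names (`mat_vec`, `power_method`), the number of iterations, and the way the eigenvalue is estimated (comparing one entry of Mv with the same entry of v) are choices made for this illustration, not something prescribed by the text. The matrix is the one consistent with the sequence $v_1, v_2, \ldots$ of this section, and the sketch assumes the iterates never become the zero vector.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def power_method(M, v0, iterations):
    """Iterate v <- Mv, dividing by the largest entry magnitude each time."""
    v = list(v0)
    for _ in range(iterations):
        v = mat_vec(M, v)
        scale = max(abs(x) for x in v)   # the quantity in (6.2.1)
        v = [x / scale for x in v]
    # Estimate the eigenvalue from the entry of largest magnitude.
    Mv = mat_vec(M, v)
    j = max(range(len(v)), key=lambda i: abs(v[i]))
    return Mv[j] / v[j], v

M = [[7, 8], [-4, -5]]
lam, v = power_method(M, [-1, 0], 25)
print(lam, v)
```

With scaling in place the entries stay between −1 and 1 no matter how many iterations are requested, and the returned vector is a multiple of $[\,-2 \;\; 1\,]^T$ up to a small error.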


Crumpet 26: The Power Method 


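Suppose M is a diagonalizable n × n matrix and P is an n × n matrix whose columns are linearly independent eigenvectors of M (see section 5.4), and let $D = P^{-1}MP$. Further suppose that the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of M are such that for some d, $\lambda_d$ is the dominant eigenvalue ($|\lambda_d| > |\lambda_i|$ for all $i \neq d$) and the eigenspace of $\lambda_d$ has dimension one. Pick an arbitrary vector $v_0$ in $\mathbb{R}^n$ and let $w = P^{-1}v_0$. Finally, define $v_k = Mv_{k-1}$ for k > 0 and Z as the n × n matrix with zeros everywhere except the d,d-entry where it has a one. Then for large enough k,

\begin{align*}
v_k = M^k v_0 &= PD^kP^{-1}v_0 \\
&= PD^k w \\
&= P \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}^k w \\
&= P \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} w \\
&= \lambda_d^k P \begin{bmatrix} \left(\frac{\lambda_1}{\lambda_d}\right)^k & 0 & \cdots & 0 \\ 0 & \left(\frac{\lambda_2}{\lambda_d}\right)^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \left(\frac{\lambda_n}{\lambda_d}\right)^k \end{bmatrix} w \\
&\approx \lambda_d^k PZw \\
&= \lambda_d^k \begin{bmatrix} 0 & \cdots & 0 & P_{:,d} & 0 & \cdots & 0 \end{bmatrix} w \\
&= \lambda_d^k w_d P_{:,d}.
\end{align*}

But $P_{:,d}$ is an eigenvector of M corresponding to $\lambda_d$, so $v_k$ is approximately an eigenvector of M corresponding to $\lambda_d$ as long as $w_d \neq 0$.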
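
To see the crumpet's computation in action, here is a sketch for the 2 × 2 matrix with rows (7, 8) and (−4, −5), the matrix consistent with the sequence $v_1, v_2, \ldots$ computed earlier in this section, with $v_0 = [\,-1 \;\; 0\,]^T$. The second eigenpair, −1 with $[\,1 \;\; -1\,]^T$, and the matrix P are not stated in the conversation; they are computed here and can be checked by direct multiplication.

\[
P = \begin{bmatrix} -2 & 1 \\ 1 & -1 \end{bmatrix}, \quad
D = \begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}, \quad
w = P^{-1}v_0 = \begin{bmatrix} -1 & -1 \\ -1 & -2 \end{bmatrix}\begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

\[
v_k = PD^kw = 3^k \begin{bmatrix} -2 \\ 1 \end{bmatrix} + (-1)^k \begin{bmatrix} 1 \\ -1 \end{bmatrix}
= 3^k \left( \begin{bmatrix} -2 \\ 1 \end{bmatrix} + \left(-\tfrac{1}{3}\right)^k \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right)
\]

For k = 2 this gives $[\,-18 + 1 \;\; 9 - 1\,]^T = [\,-17 \;\; 8\,]^T$, matching $v_2$. The term with $(-\frac{1}{3})^k$ is the part that fades away, which is why the direction of $v_k$ settles on $[\,-2 \;\; 1\,]^T$.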


Key Concepts
dominant eigenvalue an eigenvalue with larger magnitude than all other eigenvalues of a given matrix.


power method iteration of the recurrence relation $v_k = Mv_{k-1}$ for some initial vector $v_0$, usually with scaling. Under certain conditions the sequence $v_0, v_1, \ldots$ will converge to an eigenvector of M.


Exercises 


—25 129 -72 
-360 


ou] =i7 3 
9 -69 36 


(a) u=| H | (e) QESTERSD os 


1. Find the dominant eigenvalue of M. 


185 76 9 6 


-6 9 
-573 -234 -27 -18 
(b) w=| = =| -327 M=| 73 288 31 18 
417 166 15 14 
as 2 RIS ASAT 
-8 -72 72 -0 
5 1 -39 -110 88 3 
= 3 = 
ore 2 6 | a M=! (39 -126 104 3 
2 108 -104 -18 
-27 -29 39 
-360 
(e) M=]} 30 32-45 
4 4 =7 2. Does the matrix have a dominant eigenvalue? 
“3 5
(@) | =i. 3 | 
19 12 
(b) | Leama | [S]-327 


641 225-530 
(c) £3) Sagettath Cell BA -735 -279 630 


360 120 -396 


-35 -5 10 
() ESTED 7) ) -137 1 58 
-76 8 14 

[A]-360 


-6 -l1 -32 
(c) QEEAENSD 2} 72-68 -128 
-18 11 8 


(f) 8) Sagetath Cell am 
-645 430 —-473 129. -387 
—-556 466 —-412 -0 —144 
-511 662  -443 —-129 147 
—517 1106 209 -817 1209 
211 -134 277 -129 149 
[A]-360 


3. The sixth and seventh terms of the sequence defined by an initial vector, $v_0$, and the recurrence $v_k = Mv_{k-1}$, where M has a dominant eigenvalue, are given. Use this information to estimate an eigenpair of M.


Bh He -| —467 jive=| 1894 | 


742 ~3020 
[S]-327 
vex 9311 |, _[ 93494 
>~}! ~3061 |? "© | -30994 
4099 16381 
KC) ‘Ws -| 2049 fv -| 8191 | A 
27787 282494 
(d) vs =} 46793 |: v6 =| —468970 
-18613 187994 
128272 10654688 
(e) vs =| 177760 |; v6 =} 16132160 | [A]-360 
—52800 — 10084800 
-9793 -~1151951 
(f) v5 =} —9793 |;v6 =| —1151951 
7499 546193 


4. Calculate v, through v,, of the power method. Does it 
seem the method will converge? 


SageMath Cell _| 23 4 |. 
a) & 74M Ee a1 | 


11-8 
vw=| oe | [S]-328 


(b) .) SageMathCell 715M -| 2 1 


wl 


9) OREED iu-( 72 * | 


8 -17 


-8 
Vo -| if | [A]-360 


(d) OEE 1: 1 -| See } 


[ 


-21 20 -10 
@ ORMEED 1s =| 30 1 Ws | 
-24 40 -29 
9-8 
Vo | 9-8 [A]-360 
9-8 
11 -4 10 
) QEESETSD 3) uv = | 6 1 10 | 
—3 2 #0 


L 
5 
4 
5 
1 
5 
-63 =7 20 
(:) QEETESD 0 vw =| -16 -56 100 |; 
—5 2 -37 
0 
13° 
0 


Vo = [A]-360 


42 2 
COR) Sazettath cell sa| 4 3 -1 p> 


H 


5. Calculate ve through Vio9 for the matrix and vector vs 
of question 3, implementing the scaling procedure sug- 
gested in (6.2.1). Use this information to find an eigen- 
pair of M exactly. 


(a) £93) Sagetath Cell 82 [S]-328 
) ORI ;; 
(c) £3) Sagetath Cell 84 [A]-360 
@ OREIIED ; 
(e) $<) Sagettath Cell 86 [A]-360 
) ORSED «, 


6. The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic polynomial. The geometric multiplicity of an eigenvalue is the dimension of its eigenspace. They are equal when the algebraic multiplicity is one. The geometric multiplicity is always less than or equal to the algebraic multiplicity. Given are the matrix M, its characteristic polynomial p, and the geometric multiplicity, g, of the dominant eigenvalue. (i) State the dominant eigenvalue of M. (ii) State the algebraic multiplicity of the dominant eigenvalue of M. (iii) Apply the power method. Does it produce an (approximate) eigenvector? (iv) Apply the power method with a different initial vector. Does it produce an (approximate) eigenvector? Is this the same result as before?


(a) .)) Sagetath Cell 88 


-22 -48 -90 -99 
me| 732 736 47 46 |, 
-50 -40 -56 -35 P 
72 12 -63 -94 


p(a) = (A+ 76)*(A + 57\(A + 19); g =2 
(b) a) SageMathCell 89 


111 169 -39 -125 
ma| 730-47 6 22 |. 

195 310 -72 —-230 |’ 

3 -1 -6 -16 
pa) = (At 122+ 3\A-3); g= 1 
[A]-360 


7. M is not diagonalizable (does not admit 4 linearly independent eigenvectors; see section 5.4) but has a dominant eigenvalue with only one associated eigenvector. Apply the power method anyway. Does it work?


(a) i) SageMathCell 90 


-136 240 -448 352 
M= 88 -112 256 -208 
“| 224 -352 672 -512 
139 -226 408 -296 
(b) .)) Sagetath Cell 91 
14 -134 -34 59 
18 -114 -14 55 
Me) Ag. 519. 68> 24g | Pe 
48 -352 -40 172 


8. The dominant eigenvalue of M has algebraic multiplicity 2 but geometric multiplicity 1. See exercise 6 for an explanation of algebraic and geometric multiplicities.


1528 876 736 = ©—436 
M= -—1530 -870 -1080 360 

—431 -267 124 197 

782 618 500 76 


The entrywise quotient of M°°?7 by M®>*° is (approxi- 
mately) 


390 390 390 390 
390 390 390 390 
390 390 390 390 
390 390 390 390 


[A]-360 
(a) What does this say about the eigenvalues of M? 


(b) What does this say about the eigenvectors of M? 
HINT: think about the columns of M6, 


9. 


10. 


(c) What does this say about the power method ap- 
plied to M? 


92 The eigenvalues of M are 


—4, —20, -28, and 28 so M does not have a dominant 
eigenvalue. It has two different eigenvalues with max- 
imum magnitude. Running the power method with scal- 
ing as in (6.2.1) with the given vp anyway gives V109 as 
shown. Express V0 as a linear combination of the eigen- 

1 5 2 2 

0 3 0 
vectors 1} 

1 1 5 0 
the eigenvalues —4, —20, —28, 28 in that order) of M. 


(corresponding to 


12 17 39 -55 

0 55 45 -45 
ie 40 13 23 -67 | 

40 58 70 -114 

1 —0.477419 

1 0.483871 
Vo = > Vio0 ~ 

1 -l 

1 —0.709677 


What does this say about V;99. What does this say about 
the power method applied to M? 


Find the eigenvalues of M from question 4. Use the in- 
formation to supply an explanation of why the power 
method did/did not seem to converge. [A\|-360 


Exercises 11-14 describe the inverse power method and supply one example.


11. Show that if λ, v is an eigenpair of M and M is invertible, then $\frac{1}{\lambda}$, v is an eigenpair of $M^{-1}$.

12. Show that if λ, v is an eigenpair of M then λ − a, v is an eigenpair of M − aI.

13. Combine 11 and 12 to show that if λ, v is an eigenpair of M and a is not an eigenvalue of M, then $\frac{1}{\lambda - a}$, v is an eigenpair of $(M - aI)^{-1}$.

14. [SageMath Cell 93] The power method could be used to find approximations of the dominant eigenvalue (approximately 77) and associated eigenvector of

\[ M = \begin{bmatrix} 585 & -53 & -303 & -827 \\ 221 & 39 & -176 & -263 \\ 1652 & -36 & -944 & -2204 \\ -296 & -24 & 192 & 360 \end{bmatrix}. \]

With some preparation, however, the power method can be used to approximate non-dominant eigenpairs too! One of its eigenvalues, λ, is around 20. In fact 20 is closer to this particular eigenvalue than any of the others. Thus $\frac{1}{\lambda - 20}$ is the dominant eigenvalue of $(M - 20I)^{-1}$. Apply the power method to $(M - 20I)^{-1}$ to find approximations of λ and an associated eigenvector.
Answers


different $v_0$ Setting $v_0 = [\,1 \;\; 1\,]^T$, for example, leads to the sequence $v_1, v_2, \ldots, v_{10}$

\[
[\,15 \;\; -9\,]^T,\ [\,33 \;\; -15\,]^T,\ [\,111 \;\; -57\,]^T,\ [\,321 \;\; -159\,]^T,\ [\,975 \;\; -489\,]^T,
\]
\[
[\,2913 \;\; -1455\,]^T,\ [\,8751 \;\; -4377\,]^T,\ [\,26241 \;\; -13119\,]^T,\ [\,78735 \;\; -39369\,]^T,\ [\,236193 \;\; -118095\,]^T.
\]

Again the second entries are approximately $-\frac{1}{2}$ times the first, and getting closer the further we go in the sequence. For example, $\frac{-118095}{236193} \approx -0.499994$. Unless you chose $v_0$ in the direction of $[\,1 \;\; -1\,]^T$, you should have noticed the same thing for your sequence.


Figure 6.3.1: What is the area of the overlapping region?


Figure 6.3.2: What are the areas of these shapes?


6.3 Geometry: Determinants, Eigenvalues, and Area [3.3, 3.6, 4.4]


Intuitively, we might think of area as the amount of paint needed to paint a particular shape. The more paint needed, 
the larger its area, and the larger its area, the more paint needed. To have some sense of what is meant by the area of 
an object, this intuition is good enough. Larger shapes have larger area while smaller shapes have smaller area, and 
the area of a shape is some measure of this size. 

Calculating the areas of shapes (assigning numbers to areas) is another story. We certainly are not going to require 
that to find the area of an object it needs to be painted and the amount of paint used measured. What paint should be 
used, by whom, and what instrument should do the measuring? This process would be so imprecise as to be useless, 
giving the area of a single object many numerical areas. A single shape has but a single size, however, and so it must 
have but a single measure of its size—like the area formulas presented in grammar school. The area of a shape must 
be uniquely determined. 

The area of a rectangle is its length times width. The area of a triangle is one half its base times height. The
area of a circle is π times the square of its radius. Trapezoids, parallelograms, regular polygons, and unions of such
shapes have calculable areas. But what about more complex shapes? For example, take an arbitrary nonempty overlap
between a square and circle where neither the circle is contained within the square nor the square is contained within
the circle. See figure 6.3.1. Calculus provides a method for calculating its area and hints at the complexity of the
general question. By slicing the shape into smaller and smaller approximating rectangles and adding up the areas of 
those rectangles, the area can be approximated more and more accurately. The limit of these areas as the widths of the 
approximating rectangles approach zero is the area of the overlap. If you’ve taken calculus, that probably reminds you 
of integration, and it should! If you have not taken calculus, that probably sounds rather confusing and complicated, 
and it should! That is really the point. It is not an easy matter to calculate area, even of shapes that are easy to draw. 

To stretch the point just a bit further, consider the shapes in figure 6.3.2. The figure on the left is the snail of 
Solomon Golomb[10] and features an infinitely spiraling appendage. The figure on the right is referred to as a twin 
dragon as it is the union of a pair of dragon curves. Neither of these figures can be drawn with perfect precision 
since each has infinitely small detail. The twin dragon is an example of a self-similar fractal with nonzero area. Its
boundary (perimeter) is infinitely long and infinitely intricate. The more one magnifies the boundary, the more detail
is revealed. While the snail can be formed by a union of infinitely many nonoverlapping triangles in a straightforward 
way, making its area calculable, the twin dragon cannot. Even applying calculus to the problem of finding the area of 
the twin dragon is not a straightforward matter. Does it even have a calculable area? What does having a calculable 
area mean? Are there sets whose areas are not calculable? These questions can be followed deep into measure theory, 
a branch of analysis far outside the reaches of this textbook. 

With the very definition of area left as an interesting yet unresolved conundrum, it hardly makes practical sense to expect to prove the ways linear transformations affect the areas of general shapes. The following discussion is inherently incomplete this way. Certain claims regarding area will necessarily remain unproven.


Crumpet 27: A Definition of Area


The area of a bounded region of the plane, a shape S, can be defined as follows. Let R be a polygonal region containing S, and let $P_R$ be a primitive partition of R (a finite set of parallelograms and triangles whose interiors do not overlap and whose union is R). Define the norm of a partition, denoted $\|P_R\|$, as the maximum of the areas of the primitives in $P_R$. Then

\[ \operatorname{area}(S) = \lim_{\|P_R\| \to 0} \sum_{\substack{p \in P_R \\ p \subseteq S}} \operatorname{area}(p) \]

whenever such limit exists.


Areas and determinants 


In general, the image of a set S is defined as the set of images of all the points in S. That is, if S is a subset of A and T : A → B, then the image of S under T is defined by T(S) = {T(s) : s ∈ S}. This definition is typical in all of mathematics, not just linear algebra, and applies no matter the sets A and B.

To understand how the linear transformation $T_A : \mathbb{R}^2 \to \mathbb{R}^2$, $T_A(v) = Av$ affects areas, it is convenient to write A as a product of elementary matrices, $A = E_n \cdots E_2 E_1$, as we have done before, assuming A is invertible (page 106). Since $T_A(S) = (T_{E_n} \circ \cdots \circ T_{E_2} \circ T_{E_1})(S)$, if we can understand how linear transformations associated with elementary matrices affect area, we have a chance of understanding how general linear transformations affect area.

If E is a row swap matrix, then $T_E$ is a reflection about the line y = x, so in this case $\operatorname{area}(T_E(S)) = \operatorname{area}(S)$. Reflections do not change areas. If E is a row replace matrix, then $T_E$ is a shear transformation, and it is a known result of calculus that shear transformations do not affect area, so again $\operatorname{area}(T_E(S)) = \operatorname{area}(S)$. If E is a row scale matrix, then $T_E$ scales shapes either horizontally or vertically (not both!) by a factor of s, so $\operatorname{area}(T_E(S)) = |s| \cdot \operatorname{area}(S)$. In every case, $\operatorname{area}(T_E(S)) = |\det E| \cdot \operatorname{area}(S)$ (the determinant of a row swap matrix is −1, the determinant of a row replace matrix is 1 and the determinant of a row scale matrix with scale factor s is s). It follows that

\begin{align*}
\operatorname{area}(T_A(S)) &= \operatorname{area}((T_{E_n} \circ \cdots \circ T_{E_2} \circ T_{E_1})(S)) \\
&= \operatorname{area}(T_{E_n}(\cdots(T_{E_2}(T_{E_1}(S)))\cdots)) \\
&= |\det E_n| \cdots |\det E_2| \cdot |\det E_1| \cdot \operatorname{area}(S) \\
&= |\det A| \cdot \operatorname{area}(S).
\end{align*}

If A is noninvertible, then one of the columns of A is a multiple of the other, so any linear combination of the columns is also a multiple of that column. Therefore, the image of any vector, which is a linear combination of the columns of A, is a multiple of that column. Thus the image of every vector lies on the line determined by that column, giving the image of any shape zero area. The entire image is contained within a line. Of course, $|\det A| = 0$, so again we have $\operatorname{area}(T_A(S)) = |\det A| \cdot \operatorname{area}(S)$.
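
The claim is easy to test numerically for polygons, whose areas can be computed with the standard shoelace formula. The sketch below is only an illustration: the function names, the pentagon `S`, and the choice of matrix (the 2 × 2 matrix with rows (7, 8) and (−4, −5) from section 6.2) are assumptions made for the example, and the shoelace formula stands in for the general notion of area.

```python
def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order."""
    n = len(points)
    twice = sum(points[i][0] * points[(i + 1) % n][1]
                - points[(i + 1) % n][0] * points[i][1]
                for i in range(n))
    return abs(twice) / 2

def transform(A, points):
    """Apply T_A(v) = Av to every vertex."""
    return [(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)
            for x, y in points]

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[7, 8], [-4, -5]]
S = [(0, 0), (4, 0), (5, 2), (2, 3), (0, 2)]   # hypothetical pentagon
print(shoelace_area(S), shoelace_area(transform(A, S)), abs(det2(A)))
```

Swapping in other polygons or other matrices, including a noninvertible one, is a quick way to build confidence in the formula before working through the argument.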
Areas and eigenvalues


Let A be a 2 × 2 matrix with linearly independent eigenvectors $v_1$ and $v_2$ corresponding to eigenvalues $\lambda_1$ and $\lambda_2$ respectively. Then $T_A(v_1) = \lambda_1 v_1$ and $T_A(v_2) = \lambda_2 v_2$. In fact, if we let $S = \{v_1 + \alpha v_2 : 0 \le \alpha \le 1\}$, the line segment from $v_1$ to $v_1 + v_2$, then $T_A(S) = \{T_A(v_1 + \alpha v_2) : 0 \le \alpha \le 1\} = \{\lambda_1 v_1 + \alpha\lambda_2 v_2 : 0 \le \alpha \le 1\}$ is the line segment from $T_A(v_1)$ to $T_A(v_1) + T_A(v_2)$. Further analysis of line segments shows that the image of the parallelogram determined by $v_1$ and $v_2$ is the parallelogram determined by $T_A(v_1)$ and $T_A(v_2)$. Can you supply this analysis? Answer on page 215.


Letting P be the parallelogram determined by $v_1$ and $v_2$, we see that $T_A$ scales P in the $v_1$ direction by a factor of $\lambda_1$ and in the $v_2$ direction by a factor of $\lambda_2$. Therefore, the area of $T_A(P)$ equals $|\lambda_1\lambda_2|$ times the area of P. Since we have been arguing that linear transformations scale the areas of all shapes the same way, we have generally that $\operatorname{area}(T_A(S)) = |\lambda_1\lambda_2| \cdot \operatorname{area}(S)$ for any shape whose area is measurable. With respect to the eigenvectors of A, $T_A$ is a simple scaling.

Now we have that

\[ \operatorname{area}(T_A(S)) = |\det A| \cdot \operatorname{area}(S) \]

and

\[ \operatorname{area}(T_A(S)) = |\lambda_1\lambda_2| \cdot \operatorname{area}(S). \]

It must be, then, that $|\det A| = |\lambda_1\lambda_2|$, a true statement about any 2 × 2 matrix! The statement can be made much stronger, however, as in the following theorem.


Theorem 15. [Determinant and the Product of Eigenvalues] If A is an n × n matrix and $\lambda_1, \lambda_2, \ldots, \lambda_n$ are its n (possibly complex) eigenvalues, then

\[ \det A = \prod_{i=1}^{n} \lambda_i = \lambda_1\lambda_2\cdots\lambda_n. \]


Some but not all parts of the justification of this theorem are straightforward. For example, if A is upper triangular, then the conclusion follows quickly. As we have seen, det A is the product of the entries on the main diagonal. That is, $\det A = \prod_{i=1}^{n} A_{i,i}$. The characteristic equation

\begin{align*}
0 &= \det(A - \lambda I) \\
&= \det \begin{bmatrix} A_{1,1} - \lambda & * & \cdots & * \\ 0 & A_{2,2} - \lambda & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_{n,n} - \lambda \end{bmatrix} \\
&= (A_{1,1} - \lambda)(A_{2,2} - \lambda)\cdots(A_{n,n} - \lambda)
\end{align*}

has solutions $A_{1,1}, A_{2,2}, \ldots, A_{n,n}$, so the eigenvalues of A are the entries on the main diagonal of A. Hence $\prod_{i=1}^{n} A_{i,i} = \prod_{i=1}^{n} \lambda_i$, completing the proof for upper triangular matrices.
If $A$ is any matrix, the conclusion follows from two facts.

1. The determinant and eigenvalues of $P^{-1}AP$ are the same as the determinant and eigenvalues of $A$ for any
invertible $n \times n$ matrix $P$ (see theorem 14).

2. For any $n \times n$ matrix $A$, there is an $n \times n$ matrix $P$ such that $P^{-1}AP$ is upper triangular (see crumpet 28).

Given these two facts, if $U = P^{-1}AP$, then $\det A = \det U$ and the eigenvalues of $A$ are the eigenvalues of $U$ by fact
1 (theorem 14). Now if $P$ is that special matrix such that $U$ is upper triangular, as guaranteed to exist by fact 2, then
the determinant of $U$ (which equals the determinant of $A$) and the product of the eigenvalues of $U$ (which equals the
product of the eigenvalues of $A$) are both $\prod_{i=1}^{n} U_{i,i}$ and therefore equal. This concludes the proof of the theorem for
general matrices.
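
A quick numerical check of theorem 15 for $2 \times 2$ matrices can be sketched as follows. This is only an illustration, not part of the proof: the helper names `det2` and `eigenvalues2` and the two example matrices are assumptions made here, and the eigenvalues are found from the characteristic polynomial $t^2 - (\operatorname{trace} A)t + \det A$ with the quadratic formula.

```python
import cmath

def det2(a):
    """Determinant of a 2x2 matrix given as [[a11, a12], [a21, a22]]."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def eigenvalues2(a):
    """Roots of the characteristic polynomial t^2 - trace*t + det (possibly complex)."""
    trace = a[0][0] + a[1][1]
    disc = cmath.sqrt(trace * trace - 4 * det2(a))
    return (trace + disc) / 2, (trace - disc) / 2

# Hypothetical examples: one matrix with real eigenvalues, one rotation with complex eigenvalues.
examples = {
    "real eigenvalues": [[2, 1], [1, 2]],
    "complex eigenvalues": [[0, -1], [1, 0]],
}

for name, a in examples.items():
    lam1, lam2 = eigenvalues2(a)
    # The determinant and the product of the eigenvalues should agree.
    print(name, "det =", det2(a), "product of eigenvalues =", lam1 * lam2)
```

The second example is a rotation by a quarter turn: it has no real eigenvectors, yet the product of its complex eigenvalues still equals its determinant.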


Crumpet 28: Triangularization 


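For a square matrix $M$, $P^{-1}MP$ is a triangularization of $M$ whenever $P^{-1}MP$ is upper triangular. We wish to show
that there is a triangularization of any $n \times n$ matrix. Triangularization of a $1 \times 1$ matrix is simple enough since all $1 \times 1$
matrices are upper triangular. Choose $P = \begin{bmatrix} 1 \end{bmatrix}$, for example. Proceeding by induction, assume a triangularization
exists for every $(k-1) \times (k-1)$ matrix for some $k \ge 2$, and let $M$ be a particular but arbitrary $k \times k$ matrix. Take any
eigenpair $\lambda, v$ of $M$ and find vectors $u_1, u_2, \ldots, u_{k-1}$ such that $\{v, u_1, u_2, \ldots, u_{k-1}\}$ is linearly independent. This set can
always be found since $v$ must have at least one nonzero entry (0 is not a permissible eigenvector). Assuming $v_i \ne 0$,
we may take $\{v, u_1, u_2, \ldots, u_{k-1}\} = \{v\} \cup \{I_{\cdot,j} : j \ne i\}$. Setting $Q = \begin{bmatrix} v & u_1 & u_2 & \cdots & u_{k-1} \end{bmatrix}$, $Q$ is invertible (its
columns are linearly independent), and

$$\begin{aligned} Q^{-1}MQ &= Q^{-1}M \begin{bmatrix} v & u_1 & u_2 & \cdots & u_{k-1} \end{bmatrix} \\ &= \begin{bmatrix} Q^{-1}Mv & Q^{-1}Mu_1 & Q^{-1}Mu_2 & \cdots & Q^{-1}Mu_{k-1} \end{bmatrix} \\ &= \begin{bmatrix} \lambda Q^{-1}v & Q^{-1}Mu_1 & Q^{-1}Mu_2 & \cdots & Q^{-1}Mu_{k-1} \end{bmatrix}. \end{aligned}$$

While we cannot say much about $Q^{-1}Mu_j$ for any $j$, we can say $\lambda Q^{-1}v = \lambda I_{\cdot,1}$ because $Q^{-1}Q = Q^{-1}\begin{bmatrix} v & u_1 & u_2 & \cdots & u_{k-1} \end{bmatrix} = I$. $Q^{-1}$ times the first column of $Q$ must be the first column of $I$. Hence we
have

$$Q^{-1}MQ = \begin{bmatrix} \lambda & * & * & \cdots & * \\ 0 & * & * & \cdots & * \\ 0 & * & * & \cdots & * \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & * & * & \cdots & * \end{bmatrix}.$$

By the inductive hypothesis, there is a triangularization of $(Q^{-1}MQ)_{\backslash 1,1}$, the $(k-1) \times (k-1)$ matrix left after deleting the first row and first column of $Q^{-1}MQ$. Let $R$ be such that $R^{-1}(Q^{-1}MQ)_{\backslash 1,1}R$ is upper
triangular, and set $\tilde{Q} = \begin{bmatrix} 1 & 0 \\ 0 & R \end{bmatrix}$. Then $\tilde{Q}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & R^{-1} \end{bmatrix}$ and

$$\tilde{Q}^{-1}(Q^{-1}MQ)\tilde{Q} = \begin{bmatrix} 1 & 0 \\ 0 & R^{-1} \end{bmatrix} \begin{bmatrix} \lambda & * \\ 0 & (Q^{-1}MQ)_{\backslash 1,1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & R \end{bmatrix} = \begin{bmatrix} \lambda & * \\ 0 & R^{-1}(Q^{-1}MQ)_{\backslash 1,1}R \end{bmatrix}$$

is upper triangular. Hence $(Q\tilde{Q})^{-1}M(Q\tilde{Q})$ is a triangularization of $M$ and we set $P = Q\tilde{Q}$. This result suffices
for our purposes, but the result can be strengthened to specify that $Q\tilde{Q}$ have a certain property, a so-called Schur
decomposition.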
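
The construction in crumpet 28 can be tried out on a small case. The sketch below handles only $2 \times 2$ matrices, where a single step already finishes the job (the remaining $1 \times 1$ block is automatically triangular, so $R = [1]$ and $P = Q$). The function names and the example matrix are assumptions chosen for illustration, and exact rational arithmetic from Python's standard library is used so the zero below the diagonal is exact.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse2(q):
    """Inverse of an invertible 2x2 matrix."""
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    return [[q[1][1] / det, -q[0][1] / det],
            [-q[1][0] / det, q[0][0] / det]]

def triangularize2(m, v):
    """Given a 2x2 matrix m and an eigenvector v of m, return (Q, Q^{-1} m Q).

    Q has v as its first column and, as in the crumpet, a column of the
    identity I_{.,j} with j != i as its second column, where v_i != 0.
    """
    i = 0 if v[0] != 0 else 1
    j = 1 - i
    e = [Fraction(0), Fraction(0)]
    e[j] = Fraction(1)
    q = [[Fraction(v[0]), e[0]],
         [Fraction(v[1]), e[1]]]
    return q, matmul(matmul(inverse2(q), m), q)

# Hypothetical example: M has eigenpair lambda = 3, v = (1, 1).
M = [[2, 1], [1, 2]]
Q, T = triangularize2(M, [1, 1])
print(T)  # first column is (3, 0): the eigenvalue sits in the corner
```

For larger matrices the same step would be repeated on the lower-right block, which requires an eigenpair of that block at each stage.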


Hence we have two ways to measure the effect of a linear transformation on the plane. In rough terms, a linear
transformation expands or compresses areas by a factor equal to the absolute value of the determinant, which is equal
to the absolute value of the product of the eigenvalues, of its standard matrix. More precisely, a linear transformation
expands or compresses areas in the direction of each eigenvector by a factor equal to the absolute value of the
associated eigenvalue.

Determinants, eigenvalues, and volumes


The analysis of elementary $3 \times 3$ matrices follows much along the same lines as the analysis of $2 \times 2$ matrices in
section 4.4. Vectors in $\mathbb{R}^3$ can be imagined as arrows or points just as they are in $\mathbb{R}^2$. Images of cubes in $\mathbb{R}^3$ under
transformations associated with elementary matrices, analogous to the images of the coffee cup in $\mathbb{R}^2$, can be derived.
They will also be a collection of reflections, shears, and scalings. Rotation in $\mathbb{R}^3$ can be accomplished by a composition
of scalings and shears just as in $\mathbb{R}^2$. Noninvertible $3 \times 3$ matrices can be described by compositions of elementary
matrices and projections as well. Hence theorem 11 can be proved for linear operators on $\mathbb{R}^3$. Generally, if the
2’s of the present section are replaced by 3’s and the word area is replaced by the word volume, the discourse still
applies with only minor additional modification. To illustrate, for $3 \times 3$ matrices $M$ with eigenvalues $\lambda_1, \lambda_2, \lambda_3$, and
three-dimensional regions of space, $R$,

$$\operatorname{volume}(T_M(R)) = |\det M| \cdot \operatorname{volume}(R) = |\lambda_1\lambda_2\lambda_3| \cdot \operatorname{volume}(R)$$


and the concluding paragraph in the discussion of transformations of the plane might be rephrased for transformations 
of space as follows. 


We have two ways to measure the effect of a linear transformation on space, $\mathbb{R}^3$. In rough terms,
a linear transformation expands or compresses volumes by a factor equal to the absolute value of the
determinant, which is equal to the absolute value of the product of the eigenvalues, of its standard
matrix. More precisely, a linear transformation expands or compresses volumes in the direction of each
eigenvector by a factor equal to the absolute value of the associated eigenvalue.


Crumpet 29: Hyperspace 


The main results of this section and the previous are stated and hold for $\mathbb{R}^n$, giving an enterprising individual a basis
to extend the ideas of area and volume to dimensions higher than 3! The notion of a hypercube (in hyperspace) is
exactly this enterprise.


Affine Transformations 


Translations, transformations of the form $T : \mathbb{R}^n \to \mathbb{R}^n$,

$$T(x) = x + c,$$

are not linear for any $c \ne 0$. Can you provide a justification? Answer on page 6.3. But because their geometric
effect is to simply displace all points by the same distance and direction, they do not change the shapes of figures and
therefore do not change areas or volumes of figures. For a translation $T$, $\operatorname{area}(T(S)) = \operatorname{area}(S)$ for any set $S$ with
measurable area.

Affine transformations, compositions of linear transformations with translations, are consequently not linear
either, but their effect on areas is predictable. They scale areas in exactly the same manner as their linear parts. For an
affine transformation $F : \mathbb{R}^n \to \mathbb{R}^n$,

$$F(x) = Ax + c$$

for some matrix $A$ and vector $c$, and $\operatorname{area}(F(S)) = |\det A| \cdot \operatorname{area}(S)$ for any set $S$ with measurable area.
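
The claim that an affine transformation scales areas by $|\det A|$ regardless of the translation $c$ can be checked on a polygon. The sketch below is an illustration under assumptions: the matrix `A`, the vector `c`, and the helper names are made up here, and the polygon area is computed with the shoelace formula, a standard technique not introduced in the text.

```python
def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order (shoelace formula)."""
    n = len(points)
    twice_area = sum(points[i][0] * points[(i + 1) % n][1]
                     - points[(i + 1) % n][0] * points[i][1]
                     for i in range(n))
    return abs(twice_area) / 2

def affine(a, c, p):
    """Apply F(x) = Ax + c to the point p in the plane."""
    return (a[0][0] * p[0] + a[0][1] * p[1] + c[0],
            a[1][0] * p[0] + a[1][1] * p[1] + c[1])

# Hypothetical linear part and translation.
A = [[2, 1], [1, 3]]
c = (5, -4)
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [affine(A, c, p) for p in unit_square]

print("area of S:", shoelace_area(unit_square))
print("area of F(S):", shoelace_area(image))
print("|det A|:", abs(det_A))
```

Changing `c` moves the image around the plane but leaves its area alone, which is the point of the paragraph above.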


Key Concepts 


set image For any transformation (map or function) $f : A \to B$ and subset $S$ of $A$,

$$f(S) = \{f(s) : s \in S\}$$

determinant and area For any linear transformation $T_A : \mathbb{R}^2 \to \mathbb{R}^2$ and any subset $S$ of $\mathbb{R}^2$ with measurable area,

$$\operatorname{area}(T_A(S)) = |\det A| \cdot \operatorname{area}(S)$$

determinant and volume For any linear transformation $T_A : \mathbb{R}^3 \to \mathbb{R}^3$ and any subset $S$ of $\mathbb{R}^3$ with measurable
volume,

$$\operatorname{volume}(T_A(S)) = |\det A| \cdot \operatorname{volume}(S)$$

determinant and eigenvalues The determinant of any square matrix is the product of its eigenvalues.

triangularization For any square matrix $M$ there is an invertible matrix $P$ such that $P^{-1}MP$ is upper triangular.


affine transformation The composition of a linear transformation with a translation. 


Exercises (c) 5 


1. Find the area of the parallelogram with vertices 
(a) (0,0), (2,3), (5,—-1), (3, -4) [S]-328 
(b) (0,0), (1,8), (1,5), (-2, -3) 
(c) (0,0), (—5, 6), (7, 18), (12, 12) 
(d) (4,5), (8, 11), (16, 12), (12,6) [S]-329 
(e) (-1,3), 3, -1), 9, -4), 6,9) . 
(f) (4,-2), (11, -5), (9, -10), (2, -7) 


2. Use the fact that $\operatorname{area}(T_A(S)) = |\det A| \cdot \operatorname{area}(S)$ to justify
the claim that the area of the parallelogram determined
by the columns of a $2 \times 2$ matrix $A$ is $|\det A|$. Alternatively,
justify the claim that the area of the parallelogram
determined by two vectors (anchored at the origin) is the
absolute value of the determinant of the matrix whose
columns are the two vectors.

3. Calculate the area of the triangle as half of a determinant.
See exercise 2 for a hint.
(a) [Figure: triangle on coordinate axes, with a point labeled $(v_1, v_2)$.]
(b) [Figure: triangle on coordinate axes.] [S]-329
(d) [Figure: triangle on coordinate axes, with a point labeled $(u_1, u_2)$.]

4. The image of the hexagon with adjacent vertices (0,0),
(2,0), (2,1), (1,1), (1,2), (0,2) under the transformation
$T(x) = Ax$ where $A$ is a $2 \times 2$ matrix is shown. What is
the absolute value of the determinant of $A$? [A]-361
[Figure: image of the hexagon on coordinate axes.]

5. What are the eigenvalues of the matrix $A$ of question 4
assuming no reflection? [A]-361

6. Suppose $M$ factors as $P^{-1}UP$. Use the information to
find (i) $\det M$ and (ii) the eigenvalues of $M$.


-4 -9 12-10 
Bh 30 s fP=| to a | 
[S]-329 
6 12 -12 10 
mul y Sheela 
= 
(@U=| 0 oO -12}5 
0 0 12 | 
4 9 W 
p=[ 31 “| 
-7 -3 6 
1 4 4 
@u=| 0 -5 -2 | 
0 0 3 
i 3 <12 
P=|-6 -10 1 | 
6 it: A 


7. Use SageMath to verify your answers in question 6 by 
calculating M and having SageMath compute the deter- 
minant and eigenvalues. 


(a) SageMath Cell [S]-329
(b) SageMath Cell
(c) SageMath Cell
(d) SageMath Cell


8. Use the fact that $\operatorname{volume}(T_A(S)) = |\det A| \cdot \operatorname{volume}(S)$
to justify the claim that the volume of the parallelepiped
determined by the columns of a $3 \times 3$ matrix $A$ is $|\det A|$.

9. Let S be the unit square with opposite corners (0,0) and 


(1,1). Sketch the image of S under the affine transfor- 
mation T(x) = Ax +c. 


fe G7... Page 
@ a-| 0 io Fe=| 0 | 


0 lea 


) A=] ip “aa 12 | ears6t 


Answers 


of @ ae. a 
@A=| Sn 0 be=| sie | 


10. Let $S$ be the triangle with vertices (0,0), (1,1), and
(-1,1) and suppose $F$ and $G$ are affine transformations
such that $F(S)$ is the triangle with vertices (0,0), (0,1),
and (-1,1); and $G(S)$ is the triangle with vertices (0,0),
(1,1), and (0,1). Draw $S$, $F(S)$, and $G(S)$ and use your
sketch to determine the determinants of the linear parts
of $F$ and $G$.


11. Crumpet 28 outlines a recursive procedure for triangularizing
any matrix. It is constructive, giving an algorithm
for finding the triangularizing matrix $P$. Use it to find $P$
such that $P^{-1}MP$ is upper triangular. One eigenvalue of
$M$ is given.

13. -8 

(a) u-| ae aes [S]-330 
22 «-7 

) M=| 35 er a=-6 

(c) u-| oe ass [A]-361 
2 1 
-9 7 28 

(d) M=| -11 -27  -28 |;a=12 
8 8 12 
-34 -50 -24 

(ec) M=| 37 53 24 |;A=4 [A]-361 
-26 -35 -14 
78 -50 —54 

(f) M=| -167 135 186 |;a=5 
201 -150 -193 

[A]-330 


12. Prove that any square matrix $M$ can be factored as
$PUP^{-1}$ for some invertible matrix $P$ and upper triangular
matrix $U$.


13. Suppose M is invertible. What can you say about the 
eigenvalues of M’M? HINT: see exercise 2 of section 
Sad. 


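further analysis The parallelogram determined by $v_1$ and $v_2$ is the set $S = \{\beta v_1 + \alpha v_2 : 0 \le \alpha, \beta \le 1\}$ so its image is

$$\begin{aligned} T_A(S) &= T_A(\{\beta v_1 + \alpha v_2 : 0 \le \alpha, \beta \le 1\}) \\ &= \{T_A(\beta v_1 + \alpha v_2) : 0 \le \alpha, \beta \le 1\} \\ &= \{\beta T_A(v_1) + \alpha T_A(v_2) : 0 \le \alpha, \beta \le 1\} \\ &= \{\beta\lambda_1 v_1 + \alpha\lambda_2 v_2 : 0 \le \alpha, \beta \le 1\} \end{aligned}$$

which is the parallelogram determined by $T_A(v_1)$ and $T_A(v_2)$.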


translations are not linear On the one hand,

$$T(x + y) = x + y + c$$

and on the other hand,

$$T(x) + T(y) = (x + c) + (y + c) = x + y + 2c$$

so $T(x + y) \ne T(x) + T(y)$ whenever $c \ne 0$.

6.4 Approximation [4.1, 4.6, 5.1, 5.3]


From the very beginning of our discussion of linear systems, we acknowledged that there were systems with no
solution (see section 2.1 exercise 4). This was a familiar state of affairs as you undoubtedly have seen equations like
$x^2 + 1 = 0$, $\sin\theta = 2$, and $e^x = -3$, all of which have “no solution”. Full disclosure, if your instructor or textbook
claimed equations such as these had no solution, what they meant was no real number solution. All three equations
have complex number solutions. $\sqrt{-1}$, $\sin^{-1}(2)$, and $\ln(-3)$ are perfectly well defined complex numbers, and are,
respectively, solutions of the three equations. It’s possible you studied complex numbers enough to know this already,
but it’s also possible this comes as a revelation. No worries either way.

Linear systems with no solution are different. When we say they have no solution, they have no integer solution, 
no rational number solution, no real number solution, and no complex number solution. They simply have no solution. 
What more is there to say? 

The linear equation

$$54x + 30y = 17 \qquad (6.4.1)$$

has no integer solution. This can be seen by factoring a 6 from the left-hand side:

$$6(9x + 5y) = 17$$

showing that the left side is, for any integers $x$ and $y$, a multiple of 6 while the right side is not. The best we can hope
for are integers $x$ and $y$ that make $54x + 30y$ close to 17. To say it another way, we can look for integers $x$ and $y$ so
that

$$|(54x + 30y) - 17|$$

(the distance between $54x + 30y$ and 17) is small. In fact, if we could find a minimum of this quantity, that would
mean something. Among all the pairs of integers $x$ and $y$, this pair (or these pairs) make $54x + 30y$ as close to 17 as
possible. Can you find the minimum possible value of $|(54x + 30y) - 17|$ for integers $x$ and $y$? Answer on page 224.

$54(-43) + 30(78) = -2322 + 2340 = 18$, so $(\hat{x}, \hat{y}) = (-43, 78)$ is a best approximation of an integer solution of (6.4.1). The pair
$(-43, 78)$ does not solve (6.4.1), but it makes its two sides as close as possible using integers. That is,

$$|(54\hat{x} + 30\hat{y}) - 17| \le |(54x + 30y) - 17|$$

for all integer pairs $(x, y)$. Even when an equation has no solution, it may have a best approximation.
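
A brute-force search makes the idea of a best integer approximation concrete. This is a sketch only: the search window is an arbitrary assumption made here (a finite search cannot by itself prove a minimum over all integers; the multiple-of-6 argument does that), and the name `error` is introduced for illustration.

```python
def error(x, y):
    """Distance between 54x + 30y and 17."""
    return abs(54 * x + 30 * y - 17)

# Assumed search window: x in [-50, 50], y in [-100, 100].
candidates = [(error(x, y), x, y)
              for x in range(-50, 51)
              for y in range(-100, 101)]

smallest = min(c[0] for c in candidates)
best_pairs = [(x, y) for e, x, y in candidates if e == smallest]

print("smallest error found:", smallest)
print("number of pairs achieving it:", len(best_pairs))
print("a few of them:", best_pairs[:5])
```

The search turns up many pairs with the same smallest error, which is why the text speaks of "this pair (or these pairs)".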
Using this discussion as a model for linear systems with no solution, we ask whether inconsistent systems

$$Mv = b$$

have a best approximation. That is, can we find $\hat{v}$ such that

$$\|M\hat{v} - b\| \le \|Mv - b\|$$

for all $v$? For example,

$$\begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix} v = \begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix} \qquad (6.4.2)$$

is inconsistent. Can you show this? Answer on page 224. To say it another way, $\begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix}$ is not in the column space of
$\begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix}$, hinting at how to find a best approximation: look inside the column space of $\begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix}$ for a vector
that is as close to $\begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix}$ as possible.


Actually, we have done this to some extent already! A vector $b$ in $\mathbb{R}^2$ is not in the column space of a one-column matrix $M$ when
$b$ is not a multiple (linear combination) of $M_{\cdot,1}$. Nonetheless there is a multiple (linear combination) of $M_{\cdot,1}$ that is
closest to $b$. Geometrically, this means there is a point on the line determined by $M_{\cdot,1}$ closest to $b$. This situation is
diagrammed here.

[Figure: the vector $b$ and the line determined by $M_{\cdot,1}$.]

We know the shortest distance between a point and a line is measured perpendicularly. The point on the line where
this shortest distance occurs coincides exactly with the orthogonal projection of $b$ onto $M_{\cdot,1}$, as diagrammed here.

[Figure: the orthogonal projection of $b$ onto $M_{\cdot,1}$.]

Helpful to this discussion is to see orthogonal projection as projecting a vector onto a subspace (rather than a vector).
The line determined by $M_{\cdot,1}$ is a subspace of $\mathbb{R}^2$ as it is the span of $M_{\cdot,1}$.

In three dimensions, there is a point in a plane nearest any point/vector not in that plane. That point occurs exactly
at the projection of the vector onto the plane (and that plane is a subspace of $\mathbb{R}^3$). Again, projection is best viewed as
projecting a vector onto a subspace.

With this in mind, we have to address the questions of (i) how to project a vector onto a subspace of dimension 
greater than one and (ii) whether that projection is always the nearest point/vector within the subspace. A lot of this 
work has already been done, but there is a bit more to do now. Question 22 of section 5.3 provides a good backdrop 
for this conversation. 

First, let $B = \{b_1, b_2, \ldots, b_p\}$ be an orthogonal basis for a subspace $W$ of an inner product space $V$. Then the
orthogonal projection of any $v$ in $V$ onto $W$, denoted $\operatorname{proj}_W v$, is defined by

$$\operatorname{proj}_W v = \operatorname{proj}_{b_1} v + \operatorname{proj}_{b_2} v + \cdots + \operatorname{proj}_{b_p} v.$$

Because each projection is a multiple of one of the basis elements, this is a linear combination of the basis elements
and therefore lies in $W$. Next, we will need some terminology.

If $W$ is a subspace of an inner product space $V$ and $v$ is orthogonal to every vector in $W$, then we say $v$ is
orthogonal to $W$. The set of all vectors in $V$ orthogonal to $W$ is called the orthogonal complement of $W$ and is
denoted $W^\perp$ (read “W perp”). Can you show that $W^\perp$ is a subspace of $V$? Answer on page 224.

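Just as $w$ and $v - \operatorname{proj}_w v$ are orthogonal for any vectors $v$ and $w$ of an inner product space $V$ (see section 5.3), we
can now show that $v - \operatorname{proj}_W v$ is orthogonal to $W$ for any vector $v$ and subspace $W$ of an inner product space $V$.
Letting $\{b_1, b_2, \ldots, b_p\}$ be an orthogonal basis of $W$,

$$\begin{aligned} \langle v - \operatorname{proj}_W v, b_j \rangle &= \langle v - (\operatorname{proj}_{b_1} v + \operatorname{proj}_{b_2} v + \cdots + \operatorname{proj}_{b_p} v), b_j \rangle \\ &= \langle v, b_j \rangle - \langle \operatorname{proj}_{b_1} v, b_j \rangle - \langle \operatorname{proj}_{b_2} v, b_j \rangle - \cdots - \langle \operatorname{proj}_{b_p} v, b_j \rangle \\ &= \langle v, b_j \rangle - \langle \operatorname{proj}_{b_j} v, b_j \rangle \\ &= \langle v, b_j \rangle - \frac{\langle v, b_j \rangle}{\langle b_j, b_j \rangle} \langle b_j, b_j \rangle \\ &= 0 \end{aligned}$$

for each $j = 1, 2, \ldots, p$ (the terms $\langle \operatorname{proj}_{b_i} v, b_j \rangle$ with $i \ne j$ vanish because the basis is orthogonal), so $v - \operatorname{proj}_W v$ is orthogonal to every element of a basis of $W$. This is enough to show that
$v - \operatorname{proj}_W v$ is orthogonal to every vector in $W$ and therefore $v - \operatorname{proj}_W v$ is in $W^\perp$. Can you provide the details? Answer
on page 224. This leads to the following theorem.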


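Theorem 16. [Orthogonal Decomposition] Let $W$ be a subspace of an inner product space $V$. Each $v \in V$ can be
written uniquely as a sum

$$v = w + w^\perp$$

where $w \in W$ and $w^\perp \in W^\perp$.

Existence: we have just shown that $v - \operatorname{proj}_W v$ is in $W^\perp$. Since $\operatorname{proj}_W v$ is in $W$ and

$$v = \operatorname{proj}_W v + (v - \operatorname{proj}_W v)$$

we have existence.

Uniqueness: suppose $v = \tilde{w} + \tilde{w}^\perp$ for some (possibly other) $\tilde{w}$ in $W$ and $\tilde{w}^\perp$ in $W^\perp$. Then, of course, $w + w^\perp = \tilde{w} + \tilde{w}^\perp$
so $w - \tilde{w} = \tilde{w}^\perp - w^\perp$. Noting that $w - \tilde{w}$ is in $W$ and $\tilde{w}^\perp - w^\perp$ is in $W^\perp$, we have $\langle w - \tilde{w}, \tilde{w}^\perp - w^\perp \rangle = 0$. Setting
$x = w - \tilde{w} = \tilde{w}^\perp - w^\perp$, this means

$$\langle x, x \rangle = \langle w - \tilde{w}, \tilde{w}^\perp - w^\perp \rangle = 0,$$

so $x = 0$ and therefore $w = \tilde{w}$ and $w^\perp = \tilde{w}^\perp$.


Corollary 17. $v = \operatorname{proj}_W v + (v - \operatorname{proj}_W v)$ is the unique decomposition of $v$ into the sum of two vectors, one in $W$ and
one in $W^\perp$.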


Finally, we are ready to answer our original question. In the form of a theorem, we have the following. 


Theorem 18. [Best Approximation] If $W$ is a subspace of an inner product space $V$ and $v$ is in $V$, then $w = \operatorname{proj}_W v$
is the closest vector to $v$ in $W$.


Justification of theorem 18 relies on a generalization of the Pythagorean theorem. Can you prove that if $u$ and $v$
are orthogonal (vectors of an inner product space), then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$? Answer on page 225. Now let $\tilde{w}$ be
any vector in $W$, $\tilde{w} \ne w$. Since $v - w$ is in $W^\perp$ and $w - \tilde{w}$ is in $W$, they are orthogonal, and the Pythagorean theorem applies.
But $(v - w) + (w - \tilde{w}) = v - \tilde{w}$ so

$$\|v - \tilde{w}\|^2 = \|v - w\|^2 + \|w - \tilde{w}\|^2.$$

Since $\tilde{w} \ne w$, $\|w - \tilde{w}\|^2 > 0$ and therefore $\|v - \tilde{w}\|^2 > \|v - w\|^2$. In other words, $w$ is the closest point to $v$ in $W$.


Corollary 19. [Best Approximation for a Linear System] Given any $m \times n$ matrix $M$ and vector $b$ in $\mathbb{R}^m$, let $W$ be
the column space of $M$. Then any solution of $M\hat{v} = \operatorname{proj}_W b$ for $\hat{v}$ is a best approximation to a solution of $Mv = b$.


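Can you use theorem 18 to prove corollary 19? Answer on page 225. Finally, we can return to (6.4.2) and provide
an answer. We need to project

$$b = \begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix}$$

onto $W$, the column space of

$$M = \begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix}.$$

This requires an orthogonal basis of the column space of $M$. Using Gram-Schmidt orthogonalization, let

$$w_1 = M_{\cdot,2} = \begin{bmatrix} -3 \\ 9 \\ 6 \end{bmatrix}$$

and

$$w_2 = M_{\cdot,1} - \frac{\langle M_{\cdot,1}, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = \begin{bmatrix} 1 \\ 4 \\ -9 \end{bmatrix} - \frac{-21}{126} \begin{bmatrix} -3 \\ 9 \\ 6 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 \\ 11 \\ -16 \end{bmatrix}.$$

Taking (scalar multiples of $w_1$ and $w_2$)

$$\{b_1, b_2\} = \left\{ \begin{bmatrix} -1 \\ 3 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 11 \\ -16 \end{bmatrix} \right\}$$

as the orthogonal basis of the column space of $M$, the projection of $b$ onto $W$ is

$$\operatorname{proj}_W b = \operatorname{proj}_{b_1} b + \operatorname{proj}_{b_2} b = \frac{\langle b, b_1 \rangle}{\langle b_1, b_1 \rangle} b_1 + \frac{\langle b, b_2 \rangle}{\langle b_2, b_2 \rangle} b_2 = \frac{3}{2} \begin{bmatrix} -1 \\ 3 \\ 2 \end{bmatrix} - \frac{7}{54} \begin{bmatrix} 1 \\ 11 \\ -16 \end{bmatrix} = \frac{1}{27} \begin{bmatrix} -44 \\ 83 \\ 137 \end{bmatrix}.$$

Hence

$$\frac{1}{27} \begin{bmatrix} -44 \\ 83 \\ 137 \end{bmatrix} \approx \begin{bmatrix} -1.63 \\ 3.07 \\ 5.07 \end{bmatrix}$$

is the closest vector to

$$\begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix}$$

in the column space of $M$. The distance between these two vectors happens to be

$$\left\| \frac{1}{27} \begin{bmatrix} -44 \\ 83 \\ 137 \end{bmatrix} - \begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix} \right\| = \frac{1}{27} \left\| \begin{bmatrix} -260 \\ -52 \\ -52 \end{bmatrix} \right\| = \frac{52}{27}\sqrt{27} = \frac{52\sqrt{3}}{9} \approx 10.007$$

and there is no vector in the column space of $M$ closer to $b$. Hence the solution of $M\hat{v} = \operatorname{proj}_W b$,

$$\begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix} \hat{v} = \frac{1}{27} \begin{bmatrix} -44 \\ 83 \\ 137 \end{bmatrix},$$

gives the best approximation of a solution:

$$\hat{v} = \begin{bmatrix} -\frac{7}{27} \\ \frac{37}{81} \end{bmatrix}, \qquad \begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix} \begin{bmatrix} -\frac{7}{27} \\ \frac{37}{81} \end{bmatrix} = \frac{1}{27} \begin{bmatrix} -44 \\ 83 \\ 137 \end{bmatrix}.$$

See figure 6.4.1.

Figure 6.4.1: Best Approximation of a Solution of $\begin{bmatrix} 1 & -3 \\ 4 & 9 \\ -9 & 6 \end{bmatrix} v = \begin{bmatrix} 8 \\ 5 \\ 7 \end{bmatrix}$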
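
The projection in this example can be reproduced with exact fractions. The sketch below is an illustration under assumptions: the helper names (`dot`, `gram_schmidt`, `project`) are made up here, the inner product is taken to be the ordinary dot product on $\mathbb{R}^3$, and the input vectors to `gram_schmidt` are assumed linearly independent.

```python
from fractions import Fraction

def dot(u, v):
    """Ordinary dot product."""
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

def project(v, basis):
    """Orthogonal projection of v onto the span of an orthogonal basis."""
    p = [Fraction(0)] * len(v)
    for b in basis:
        p = add(p, scale(Fraction(dot(v, b)) / dot(b, b), b))
    return p

def gram_schmidt(vectors):
    """Orthogonal basis for the span of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = sub([Fraction(a) for a in v], project(v, basis))
        basis.append(w)
    return basis

# Columns of M, second column first to follow the order used in the text.
columns = [[-3, 9, 6], [1, 4, -9]]
b = [8, 5, 7]

basis = gram_schmidt(columns)
p = project(b, basis)
residual = sub(b, p)

print("projection:", p)
print("residual:", residual)
print("squared distance:", dot(residual, residual))
```

The residual comes out orthogonal to both columns of $M$, which is exactly the statement that $b - \operatorname{proj}_W b$ lies in $W^\perp$.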


Key Concepts

best approximation of $Mv = b$ is a vector $\hat{v}$ such that

$$\|M\hat{v} - b\| \le \|Mv - b\|$$

for all $v \ne \hat{v}$. ($M$ is an $m \times n$ matrix, $v$, $\hat{v}$ are in $\mathbb{R}^n$ and $b$ is in $\mathbb{R}^m$.)

best approximation theorem see theorem 18 and corollary 19.


orthogonal a vector $v$ is orthogonal to a subspace $W$ if $v$ is orthogonal to every vector in $W$. $v$ is orthogonal to $W$ if
and only if $v$ is orthogonal to every vector in a basis of $W$.


orthogonal projection of a vector $v$ onto a subspace $W$, denoted $\operatorname{proj}_W v$, is defined by

$$\operatorname{proj}_W v = \operatorname{proj}_{b_1} v + \operatorname{proj}_{b_2} v + \cdots + \operatorname{proj}_{b_p} v$$

for any orthogonal basis $\{b_1, b_2, \ldots, b_p\}$ of $W$.


orthogonal complement of a subspace $W$ is the set of all vectors orthogonal to $W$, denoted $W^\perp$. For any vector $v$,
$v - \operatorname{proj}_W v$ is in $W^\perp$.


orthogonal decomposition writing $v$ as a sum $w + w^\perp$ where $w$ is in $W$ and $w^\perp$ is in $W^\perp$. See theorem 16.


Exercises (a) v=| ia fo -5) 


1. What multiple of v lands closest to the point (is a best 7 
approximation)? (b) v= -~6 PP (12, 1) [S}-331 
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(c) V= 


fas 


(d) v= | (0, -7, 12) [A]-361 


(e) V= 


-11 


2 


-1l 


(g) v= ; (0, -9, 7, —4) 


=3 


(h) v= ; (-10, 6,3, -4) [A]-361 


ia 
(f) | 3 peso [A]-361 


@ v=| “¢ |:(-1,12,-8,-2) 


2. Use orthogonal projection to find the distance between 
the point and the line @. 


(a) (—10, 12); &(x) =-Hx 
(b) (-3, -4); €(x) = 4x [S]-331 
(c) (-8, 9); (x) = 8x 


@1n.e={r| a |sring} [A]-361 


G,-19;.¢= {| _ | rin} 


-3 
(f) (6,-7,5); €= | -11 |:rina} 


(d 


YS 


(e 


VY 


0 
[A]-361 
5 
(g) (-4,1,12);€=4r} 0 |: rinR 
9 
-4 
-l1|.. 
(h) (-2,0,2,-D;€=4r} 9 |irinR 
0 
[A]-361 
5 
@ ©,11,5,-8); € = 4r ee :rinR 
2 


3. Find the orthogonal projection of v onto span. 8 is an 
orthogonal set. 


(a) =| e: je-{| 2 |} [A]-361 


woofs feels) 


-4 4 
(©) v=| -6 |;B=4] 0 |} [A}361 

9 -10 

12 ay 6 
@) v=| 5 |;B8=4] 10 |} -4 

=10 2 =f 

9 ote one 
(e) v=| -12 |;B= -4 },; 2 |,) 1 

5 5 I I 

[S]-331 

iW 7 

a9 8 
Ove) 4 PRR 1 10 

12 ll 

ll 12 6 

0 1 =i 
(Y=) 19 PBR4l 4 10 

5 3 = 

[A]-361 

=, a2 | 42 | 26 

-8 2 eee 
VA) 3 PS=)) 29 | 5s Pl ar 

8 4 2 43 


4. Find the orthogonal projection of v onto spanS. S is not 
an orthogonal set. 


@v=[ &hs={ 4]? ]} 


7 
(b) v=| -7 |; 
9 
0 6 6 
§ si) =o |.) =16 |.) <7 |) popes 
-6 2 8 
8 
(c) v=] 9 
-9 
<9] [11 q 
fod <5 |) <10- |.) 3 
12 -9 ll 
-8 9 -12 
=i 5 ~2 
wy) g |e =10 |*} 12 
5 10 =9 
[A]-361 
4 
(e) V= : 
-8 
x5 4 a4 
1 12 9 
ieee | 7 -6 
5 re) 7 
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: (:) ORES 10: 
XxX 


®v=| 6 | - 8y + 3z = 10 
-llx + Ty - 6z = 4 [A]-361 
| -9x + 3y - 42 = 9 
-5 1 -9 10 
9 4 4 2 8. Find a basis for W+. 
59) 10 |} |] aoe |*) 12 10 
-11 8 -7 -11 (a) W = span 5 
[A]-361 4 
5. Isvin W+? q 
1 4 (b) W = span 11 [S]-333 
— ~ = = = 12 
w[ 3 by-oa( af 
-1 9 
0 a) 8 (c) W = span -5 |,) -8 
(b) v=} —8 |; W= span 0 |,| -3 3 -] 
6 0 —4 
[S]-332 5 
rr 5 3 (d) W=span4| —3 |,] —-l1 [A]-361 
7 7 -9 5 
(c) v=] 11 |; W=span 11 |, 2 
=] 61 -11 9. Argue that for any subspace W of an inner product space 
4 é V, dim W + dim W+ = dim V. 
(@) v=| 7 Wwe 1 4 10. Argue that for any subspace W of an inner product space 
2 . . _4 : 6 V, if v is in W, then projyv = v. [A]-361 
[A]-361 11. Argue that for any subspace W of an inner product space 


V, the orthogonal decomposition of 0 is 0 + 0. 
6. For v and W in question 5, find the orthogonal decompo- 


sition of v relative to W. [S]-333 [A]-361 12. Let W be a subspace of an inner product space V, and let 


B be a basis of W. Argue that v is orthogonal to each 
7. Solve the system. If it is inconsistent, find a best approx- vector in 8 if and only if v is in Wt. [A]-361 


imation to a solution. 5 . ie . 
The remainder of the exercises are set within the inner product 


(a) @ SageMath Cell aye space P3(R) with inner product 
-6 + 3y - 152 = 7 
ie a 6 oe SS ee (P.4) = P(-1g(-1) + pO)q(0) + pA)q(1) + p(2)q(2). 
Ae PY ae eG You may find the SageCell at £3) Sage ath Cell] 105 helpful. It 
(b) @ SageMath Cell Fy computes this inner product for arbitrary third degree polyno- 
l4x + 4y + 22 = 9 tals 
3x - y + 6 = 1 13. What multiple of q lands closest to p (is a best approxi- 
-5x -— S5y + 102 = 8 mation)? 
a) p=7x° -10x-5 
) OEE 100 ce a a 
Tx + Illy + 12z = -6 : 
5x -— 8y -— 42 = -1 [A}-361 (b) p=—10x° +5x+4 
4y - Tz = -2 q = 3x°+ 8x? -x-7 
(c) p =x -11x?-9x4+10 
(d) £3) Sage ath Cell] 101 q = 12x37 -x-5 [S]-334 
-4x + 2y - 102 = -Il1 
8x - 4y + 202 = 6 14. How far off is the best approximation in question 
i = y Se = 13? [S]-334 
15. Find the orthogonal projection of p onto span8. 8 is an 
(e) £3) SageMath Cell] 102 orthogonal set. 
—-3x - 6y + 3z = -12 F 5 
Sn = Oy = 192 = 12 (a) p=x?+3x°-6x4+4 
ox - lly - 32% = -37 B = {-6x? + 2x- 1} [A]-361 
[A]-361 (b) p= 8x? -x-12 
-[2 2 
¢) OES 10: ie hala 
5x - 3y + Z = —4 (c) p = 2x7 -9x° +9x-6 
=e & Gy = Bere § B = {11x + 3x- 1,-4x9 + 8x7 + 17x — 28} 


20x -— 12y + 4, = -16 [A]-361 
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(d) p= 113 +9x- 10 
B= {x3 - 9x - x-2, 
4x3 — 12x27 4+ 3x4 47 
(e) p = -2x° —5x° +5x-11 
B= 5x? - 11x + 14,4x? - 12x-5} 
[A]-361 


16. Find the orthogonal projection of p onto span8. 8 is not 


17. Is pin W+? 


(a) p=5x°-2x°+3x4+2 
W = span {7x° + 8x2 — 8x- 1} [A]-362 


(b) p= 5x3 - 8x2 - 11x +6 
W = span {5x° + 3x? + 36, 4x? - x +2] 


(c) p=x°4+ 10x? -2x-4 


an orthogonal set. W = span {-423 + 3x? -2x- 6,329 + 53°} 


(a) p=2x4+x7-x-3 
B = {8x9 + 11x? - 6x, 9x3 - 7x - 9} 
(b) p=x—5x°4+12x-7 
B = {-3x? + 3x- 1-79 + 3x4 11} 
[A]-361 
(c) p=—10x3 + 10x? + 11x +12 
B= {9x3 — 9x, -6x° -— 1} 


18. For p and W in question 17, find the orthogonal decom- 
position of p relative to W. [A]-362 


19. Find a basis for W-. 
(a) W= span {1, x, 2? = 2 


(b) W= span {x +1, -2x- 2} [A]-362 


Answers 


minimum integer solution We know that $|(54x + 30y) - 17|$ cannot equal zero, so the best we can hope for is to
find integers $x$ and $y$ so that $|(54x + 30y) - 17| = 1$. That is, $(54x + 30y) - 17 = 1$ or $(54x + 30y) - 17 = -1$.
Adding 17 to both sides of these equations, we seek integer solutions of $54x + 30y = 18$ or $54x + 30y = 16$.
Since 16 is not a multiple of 6, there is no hope of finding integer solutions of $54x + 30y = 16$. Since 18 is a
multiple of 6, perhaps there are integer solutions of $54x + 30y = 18$. Dividing both sides of the equation by
6, $9x + 5y = 3$ or $9x = 3 - 5y$. As long as $3 - 5y$ is a multiple of 9, we will have a solution. For example,
$y = -3$ makes $3 - 5y = 18$ (and $x = 2$ makes $9x = 18$), so one solution is $(x, y) = (2, -3)$. Sure enough,
$|(54 \cdot 2 + 30 \cdot (-3)) - 17| = |108 - 90 - 17| = 1$. There are others.


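inconsistent system The augmented matrix for the system reduces as follows.

$$\begin{bmatrix} 1 & -3 & 8 \\ 4 & 9 & 5 \\ -9 & 6 & 7 \end{bmatrix} \to \begin{bmatrix} 1 & -3 & 8 \\ 0 & 21 & -27 \\ 0 & -21 & 79 \end{bmatrix} \to \begin{bmatrix} 1 & -3 & 8 \\ 0 & 21 & -27 \\ 0 & 0 & 52 \end{bmatrix}$$

An echelon form has a pivot in the rightmost column (the third row represents the equation $0 = 52$), so the
system is inconsistent.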

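$W^\perp$ is a subspace Let $u$ and $v$ be in $W^\perp$ and $c$ be a scalar. (By definition, $\langle u, w \rangle = \langle v, w \rangle = 0$ for all $w$ in $W$.) Then for
any $w$ in $W$,

$$\langle 0, w \rangle = \langle 0w, w \rangle = 0\langle w, w \rangle = 0$$

and

$$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle = 0$$

and

$$\langle cu, w \rangle = c\langle u, w \rangle = c \cdot 0 = 0$$

so $0$, $u + v$, and $cu$ are all in $W^\perp$. Since $W^\perp$ is a subset of $V$, this is sufficient to show that $W^\perp$ is a subspace.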


$v - \operatorname{proj}_W v$ is in $W^\perp$ Suppose $v$ is orthogonal to every element of any basis $B = \{b_1, b_2, \ldots, b_p\}$ of a subspace $W$.
Then for any scalars $c_1, c_2, \ldots, c_p$,

$$\begin{aligned} \langle v, c_1 b_1 + c_2 b_2 + \cdots + c_p b_p \rangle &= \langle v, c_1 b_1 \rangle + \langle v, c_2 b_2 \rangle + \cdots + \langle v, c_p b_p \rangle \\ &= c_1 \langle v, b_1 \rangle + c_2 \langle v, b_2 \rangle + \cdots + c_p \langle v, b_p \rangle \\ &= 0. \end{aligned}$$

Since every vector in $W$ has the form $c_1 b_1 + c_2 b_2 + \cdots + c_p b_p$, this shows $v$ is orthogonal to every vector in $W$
and therefore is in $W^\perp$.


REMARK: Note that if $v$ is in $W^\perp$, then $v$ is orthogonal to every element of any basis $B$ (since $v$ is orthogonal
to every vector in $W$, including basis vectors). Altogether we have that $v$ is in $W^\perp$ if and only if $v$ is orthogonal
to every element of a basis of $W$.


Pythagorean theorem Because $u$ and $v$ are orthogonal, $\langle u, v \rangle = 0$ and therefore

$$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle = \langle u, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2.$$


theorem 18 implies corollary 19 Corollary 19 is the special case of theorem 18 where $W$ is the column space of $M$,
$V = \mathbb{R}^m$ and $v = b$.

Further Applications


7.1 Linear Regression [6.4]


Perhaps the most ubiquitous application of linear algebra outside the boundaries of mathematics is linear regression, 
used to test hypotheses and produce models of phenomena in innumerable fields including meteorology, criminology, 
economics, materials science, archaeology, engineering, and psychology [21, 8, 16, 6, 1, 23, 3]. Anywhere two or more
quantities are suspected of correlation, regression analysis can be performed. In its simplest form, two quantities are 
suspected of having a linear relationship. Data are collected on the two quantities, and a model (linear function) 
predicting one quantity based on the other is produced and analyzed. 

For example, it is well known that the distance a gas or diesel powered vehicle is driven (in miles, for example) is 
more or less directly proportional to the volume of fuel (in gallons, for example) consumed. Also understood is that 
highway driving generally uses less fuel per mile than city driving. This is why statistics on new cars will include 
both a highway and a city mileage estimate. The graphs in figure 7.1.1 were produced from the February driving data 
for a 2010 VW Jetta Sportwagen TDI in the chart below. Only February data are considered because it is also known 
that ambient temperature affects a combustion engine’s efficiency. This car was driven in New England, where the 
average temperature in February is around 30°F, 0°C. 

The graphs confirm the claims that driving longer distances requires more fuel (left graph) and that highway 
driving uses fuel more efficiently than city driving (as average speed increases so does mileage, right graph). When 
trends like these are observed, linear regression provides a way to quantify the relationship between the variables in 
the form of a function. This function can then be used to predict one quantity from the other. 


Fill-up Date | Elapsed Miles | Gallons | Average Speed | Price per Gallon
02/08/12 | 450 | 13.25 | 21.739 | 4.36
02/23/12 | 685 | 18.101 | 29.783 | 4.40
02/17/13 | 394 | 12.956 | 18.098 | 4.36
02/01/14 | 445 | 12.014 | 29.568 | 4.38
02/16/14 | 432 | 12.696 | 28.571 | 4.60
02/26/14 | 529 | 13.861 | 27.696 | 4.46
02/06/15 | 453 | 13.233 | 24.486 | 3.25
02/22/15 | 357 | 10.142 | 34.932 | 3.00
02/12/16 | 442 | 12.11 | 27.625 | 2.30
02/16/17 | 455 | 13.971 | 26.045 | 2.68
02/02/18 | 446 | 13.343 | 27.328 | 3.10
02/20/18 | 441 | 13.003 | 27.947 | 3.15


Let’s say the owner of this vehicle is planning a trip from New Haven, CT to Augusta, ME (approximately 600
miles round trip) next February and is interested in how much fuel will be used. Perhaps the simplest way to estimate
is to sum the elapsed miles, sum the gallons, and divide. This gives an average of about 0.0286996 gallons per mile.
A 600-mile trip at this rate of consumption would require $0.0286996 \cdot 600 = 17.21976$ gallons of diesel.

Figure 7.1.1: Graphs of Diesel Data. Left graph: Diesel Consumption in February. Right graph: Mileage vs Average Speed.

For this application, that is probably good enough. However, we can do slightly better using linear regression. 
We know that fuel consumed is (roughly) directly proportional to miles driven, so they are related by a function of 
the form y = kx. Either variable can represent either quantity, but since we are interested in predicting fuel required 
given distance driven, we are looking for a function of the form f(x) = kx where x = distance driven and f(x) is the 
fuel required. The simple average calculated above produces the model f(x) = 0.0286996x, but is this the best value 
for k? 

It depends on how you define “best value”, but one reasonable definition is to minimize the sum of the squared errors, where an error, also known as a residual, is the difference between an observed response and the modeled response to the same input. For example, 12.956 is the observed response to the input 394 (observed on 02/17/13). The modeled response, using f(x) = 0.0286996x, is f(394) = 0.0286996 · 394 = 11.308 gallons, however. The error, or residual, for this observation is therefore 11.308 − 12.956 or −1.648 gallons. The squared error is (−1.648)² ≈ 2.716. A similar squared error can be calculated for each observation. The sum of the squared errors is, accurate to 5 decimal places, 9.30835.
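The averaging estimate and its sum of squared errors can be reproduced with a few lines of Python. This is only a sketch: the names `miles`, `gallons`, and `sum_squared_errors` are ours, and because the code carries the average at full precision rather than rounding it to 0.0286996, the last digit or two of the sum may differ from the figure quoted above.

```python
# Elapsed miles and gallons from the February fill-up table.
miles = [450, 685, 394, 445, 432, 529, 453, 357, 442, 455, 446, 441]
gallons = [13.25, 18.101, 12.956, 12.014, 12.696, 13.861,
           13.233, 10.142, 12.11, 13.971, 13.343, 13.003]

def sum_squared_errors(k):
    """Sum of squared residuals for the model f(x) = k*x."""
    return sum((k * x - y) ** 2 for x, y in zip(miles, gallons))

k_avg = sum(gallons) / sum(miles)   # simple average, in gallons per mile
trip_estimate = k_avg * 600         # fuel for a 600-mile trip
sse_avg = sum_squared_errors(k_avg)

print(round(k_avg, 7), round(trip_estimate, 2), round(sse_avg, 4))
```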

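As a linear algebra problem, finding the best value of k in this sense amounts to finding the best approximation (see corollary 19) of Mv = b where

\[
M = \begin{bmatrix} 450 \\ 685 \\ 394 \\ \vdots \\ 441 \end{bmatrix}, \quad
v = \begin{bmatrix} k \end{bmatrix}, \quad
b = \begin{bmatrix} 13.25 \\ 18.101 \\ 12.956 \\ \vdots \\ 13.003 \end{bmatrix}
\]

since

\[
\|Mv - b\| = \left\| \begin{bmatrix} 450k - 13.25 \\ 685k - 18.101 \\ 394k - 12.956 \\ \vdots \\ 441k - 13.003 \end{bmatrix} \right\|
= \sqrt{(450k - 13.25)^2 + \cdots + (441k - 13.003)^2},
\]

the square root of the sum of the squared errors. And corollary 19 tells us the best approximation is the projection of b onto the column space of M. Letting W be the column space of M,

\[
\operatorname{proj}_W b = \operatorname{proj}_{M_{\cdot,1}} b
= \frac{b \cdot M_{\cdot,1}}{M_{\cdot,1} \cdot M_{\cdot,1}}\, M_{\cdot,1}
= \frac{74639.689}{2619895}\, M_{\cdot,1}
= 0.0284896\, M_{\cdot,1}, \tag{7.1.1}
\]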


so the best value of k is 0.0284896 (giving about 17.09 gallons for a 600-mile trip). The sum of the squared errors for the model f(x) = 0.0284896x is ‖M[0.0284896] − b‖² ≈ 9.19278, slightly lower than the 9.30835 we got from the model f(x) = 0.0286996x. Plotting the model on the same axes as the data illustrates the closeness of fit.
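The projection in (7.1.1) is just a ratio of two dot products, which makes it easy to check numerically. The sketch below (helper name `dot` is ours) computes k, the projected vector, and the residual, and confirms that the residual is orthogonal to the single column of M, up to rounding.

```python
miles = [450, 685, 394, 445, 432, 529, 453, 357, 442, 455, 446, 441]
gallons = [13.25, 18.101, 12.956, 12.014, 12.696, 13.861,
           13.233, 10.142, 12.11, 13.971, 13.343, 13.003]

def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Projection coefficient of b (gallons) onto the one column of M (miles).
k_best = dot(gallons, miles) / dot(miles, miles)

projection = [k_best * x for x in miles]                 # proj_W b
residual = [y - p for y, p in zip(gallons, projection)]  # b - proj_W b
sse_best = dot(residual, residual)                       # squared length of residual

print(round(k_best, 7), round(sse_best, 5), round(600 * k_best, 2))
```

Because the residual is orthogonal to the column of M, no other multiple of that column can come closer to b.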
The driving data provide other opportunities for linear regression. Plotting the price of diesel over time produces the following graph.

This graph shows an overall downward trend in price over the six-year span of the data. Linear regression can be used to capture this overall trend. If we are interested in an average decrease in price over this time span, we could find the best fit model of the form p(t) = p₀ + rt, a linear model whose slope estimates the average annual drop in price.

As a linear algebra problem, we wish to find the best approximation to Mv = b where M holds the inputs, b holds the responses, and v holds the unknown parameters p₀ and r. In this case, the input variable is time, which we will measure in days since 1 February 2012:

\[
M = \begin{bmatrix} 1 & 7 \\ 1 & 22 \\ 1 & 382 \\ \vdots & \vdots \\ 1 & 2211 \end{bmatrix}, \quad
v = \begin{bmatrix} p_0 \\ r \end{bmatrix}, \quad
b = \begin{bmatrix} 4.36 \\ 4.40 \\ 4.36 \\ \vdots \\ 3.15 \end{bmatrix},
\]

which has best approximation

\[
\begin{bmatrix} p_0 \\ r \end{bmatrix} = \begin{bmatrix} 4.55591498045673 \\ -0.000845069933663 \end{bmatrix}. \tag{7.1.2}
\]

Since we measured time in days, r represents the average change in price per day, not year. To get an annual change, we multiply r by 365 to get −0.308, an average decrease of approximately 31 cents per year.
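For readers who want to see where the day counts come from, here is a small sketch that converts the fill-up dates to days since 1 February 2012 and fits the line. The variable names are ours, and the two-parameter fit is done with the standard closed-form solution of a 2 × 2 system rather than by projection; it is meant as a check on (7.1.2), not as the method used to produce it.

```python
from datetime import date

# (fill-up date, price per gallon) pairs from the February table.
fillups = [
    (date(2012, 2, 8), 4.36), (date(2012, 2, 23), 4.40),
    (date(2013, 2, 17), 4.36), (date(2014, 2, 1), 4.38),
    (date(2014, 2, 16), 4.60), (date(2014, 2, 26), 4.46),
    (date(2015, 2, 6), 3.25), (date(2015, 2, 22), 3.00),
    (date(2016, 2, 12), 2.30), (date(2017, 2, 16), 2.68),
    (date(2018, 2, 2), 3.10), (date(2018, 2, 20), 3.15),
]
origin = date(2012, 2, 1)
t = [(d - origin).days for d, _ in fillups]   # days since 1 February 2012
p = [price for _, price in fillups]

# Sums that appear in the 2x2 system for the intercept p0 and slope r.
n = len(t)
St, Stt = sum(t), sum(x * x for x in t)
Sp, Stp = sum(p), sum(x * y for x, y in zip(t, p))

det = n * Stt - St * St
p0 = (Stt * Sp - St * Stp) / det   # intercept: modeled price on day 0
r = (n * Stp - St * Sp) / det      # slope: change in price per day
annual_change = 365 * r            # change in price per year

print(t)
print(round(p0, 6), r, round(annual_change, 3))
```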

The graph of diesel price over time does not indicate a steady decline, however. While the overall trend is downward, there is a fluctuation as well. A more accurate model of the actual price over this time period would come from a model that captures this fluctuation. For example, a model of the form f(t) = p₀ + rt + α sin(ωt) + β cos(ωt) might provide reasonable results since it includes a linear portion (p₀ + rt) to capture the overall decrease and a periodic portion (α sin(ωt) + β cos(ωt)) to capture the fluctuation. However, linear regression only approximates parameters that vary linearly with respect to the response and f(t) = p₀ + rt + α sin(ωt) + β cos(ωt) does not vary linearly with respect to its parameters.


Crumpet 30: Linear Variation 


If the value of a function for each set of fixed inputs is a linear combination of its parameters, we say the function 
varies linearly with respect to its parameters or is linear in its parameters. Otherwise the function is nonlinear in its 
parameters. 


For example, f(1) = p₀ + r + α sin(ω) + β cos(ω) is not a linear combination of p₀, r, α, β, and ω. It does not take the form a₀p₀ + a₁r + a₂α + a₃β + a₄ω. However, after fixing ω = 2π/2555, ω is no longer a parameter and f(1) = p₀ + r + sin(2π/2555) · α + cos(2π/2555) · β is a linear combination of its parameters p₀, r, α, β, and f(t) is so for every other value of t too. Setting the value ω = 2π/2555 gives the sine and cosine functions a period of 2555 days, or 7 years.

As a linear algebra problem, we wish to find the best approximation to Mv = b where M holds the inputs, b holds the responses, and v holds the unknown parameters. Again, time will be measured in days since 1 February 2012. The columns of M hold 1, t, sin(ωt), and cos(ωt):

\[
M = \begin{bmatrix}
1 & 7 & 0.0172134 & 0.999852 \\
1 & 22 & 0.0540754 & 0.998537 \\
1 & 382 & 0.807206 & 0.590269 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 2211 & -0.748605 & 0.663016
\end{bmatrix}, \quad
v = \begin{bmatrix} p_0 \\ r \\ \alpha \\ \beta \end{bmatrix}, \quad
b = \begin{bmatrix} 4.36 \\ 4.40 \\ 4.36 \\ \vdots \\ 3.15 \end{bmatrix},
\]

which has best approximation

\[
\begin{bmatrix} p_0 \\ r \\ \alpha \\ \beta \end{bmatrix} =
\begin{bmatrix} 4.47274711624545 \\ -0.000907047402863 \\ 0.71642901672004 \\ -0.22510779212064 \end{bmatrix}. \tag{7.1.3}
\]


The sums of the squared errors for the two models

\[
p(t) = 4.55591 - 8.45070(10)^{-4}\,t
\]
\[
f(t) = 4.47275 - 9.07047(10)^{-4}\,t + 0.716429 \sin(\omega t) - 0.225108 \cos(\omega t)
\]

are 3.02939 and 0.460556, respectively. The graphs of the two models superimposed on the data clearly illustrate how much closer f comes to predicting the observed data.
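The comparison between the two models can be sketched in code. Several things here are assumptions on our part: the helper names (`solve`, `least_squares`, `sse`), the use of the normal equations with Gaussian elimination rather than an orthogonal projection, the eight day counts not printed in the text (computed from the fill-up dates), and the value ω = 2π/2555, which is inferred from the tabulated sine and cosine values.

```python
import math

# Days since 1 February 2012 and price per gallon for the twelve fill-ups.
t = [7, 22, 382, 731, 746, 756, 1101, 1117, 1472, 1842, 2193, 2211]
p = [4.36, 4.40, 4.36, 4.38, 4.60, 4.46, 3.25, 3.00, 2.30, 2.68, 3.10, 3.15]
OMEGA = 2 * math.pi / 2555   # assumed angular frequency (7-year period)

def solve(A, y):
    """Solve the square system A x = y by elimination with partial pivoting."""
    n = len(A)
    aug = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[i][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def least_squares(M, b):
    """Best approximation of M v = b via the system (M^T M) v = M^T b."""
    cols = list(zip(*M))
    MtM = [[sum(a * c for a, c in zip(ci, cj)) for cj in cols] for ci in cols]
    Mtb = [sum(a * y for a, y in zip(ci, b)) for ci in cols]
    return solve(MtM, Mtb)

def sse(M, b, v):
    """Sum of squared errors of the fitted model."""
    return sum((sum(m * x for m, x in zip(row, v)) - y) ** 2
               for row, y in zip(M, b))

M_line = [[1.0, ti] for ti in t]
M_wave = [[1.0, ti, math.sin(OMEGA * ti), math.cos(OMEGA * ti)] for ti in t]
v_line = least_squares(M_line, p)
v_wave = least_squares(M_wave, p)
sse_line = sse(M_line, p, v_line)
sse_wave = sse(M_wave, p, v_wave)

print(v_line, sse_line)
print(v_wave, sse_wave)
```

Since the periodic model contains the linear one as the special case α = β = 0, its sum of squared errors can never be larger.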
For a final linear regression on the February driving data, we return to the thought that mileage is affected by driving speed. One model that incorporates this fact is g(x, s) = (k + βs)x = kx + βsx, where x is distance (as before) and s is average speed. The number of gallons consumed per mile, (k + βs), varies with average speed s and g(0, s) = 0. This model is slightly different from the ones we have derived so far. Here, we have two input variables, making this a multiple linear regression, or multilinear regression. The principle is the same, however. We wish to find the best approximation of Mv = b where M holds the inputs, b holds the responses, and v holds the unknown parameters. In this case, the columns of M hold x and sx:

\[
M = \begin{bmatrix}
450 & 9782.55 \\
685 & 20401.355 \\
394 & 7130.612 \\
\vdots & \vdots \\
441 & 12324.627
\end{bmatrix}, \quad
v = \begin{bmatrix} k \\ \beta \end{bmatrix}, \quad
b = \begin{bmatrix} 13.25 \\ 18.101 \\ 12.956 \\ \vdots \\ 13.003 \end{bmatrix},
\]

which has best approximation

\[
\begin{bmatrix} k \\ \beta \end{bmatrix} = \begin{bmatrix} 0.0379147381262610 \\ -0.000346513745672586 \end{bmatrix} \tag{7.1.4}
\]

and sum of squared errors 5.21664. Compare this to the 9.30835 we found without considering average driving speed. Given that the hypothetical trip from New Haven to Augusta is to be driven mostly on the highway, we can approximate the required fuel by, for example,

\[
g(600, 45) = (0.0379147 - 0.000346514(45))600 \approx 13.39,
\]


significantly different from the original estimates of over 17 gallons. The 13.39 gallon estimate should be met with 
some skepticism, however. It uses an average speed of 45 miles per hour while the highest average speed for which 
we have data is about 35 miles per hour. There is no evidence that the model applies to an average speed of 45 miles 
per hour. More data and possibly a revision to the model should be considered before using an average speed of 45. 
On the other hand, hypothesizing that it is reasonable to expect the car’s efficiency to be better at an average speed of 
45 miles per hour than it is at an average speed of 35 miles per hour, we can use the model with an average speed of 
35 to get an (expected) overestimate of the required volume of fuel. 


\[
g(600, 35) = (0.0379147 - 0.000346514(35))600 \approx 15.47
\]

is still considerably less than 17, and likely an overestimate.
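The two predictions, and the caution about extrapolating, can be expressed in a short sketch. The coefficients are the rounded values from (7.1.4); the function names and the idea of flagging speeds above the largest observed average speed (34.932 mph in the table) are illustrative choices of ours.

```python
K = 0.0379147                 # gallons per mile at zero average speed, from (7.1.4)
BETA = -0.000346514           # change in gallons per mile per mph, from (7.1.4)
MAX_OBSERVED_SPEED = 34.932   # largest average speed in the February table

def gallons_needed(distance, speed):
    """Modeled fuel use g(x, s) = (k + beta*s) * x."""
    return (K + BETA * speed) * distance

def is_extrapolation(speed):
    """True when the speed lies above every observed average speed."""
    return speed > MAX_OBSERVED_SPEED

for s in (45, 35):
    note = "extrapolated" if is_extrapolation(s) else "within data"
    print(s, round(gallons_needed(600, s), 2), note)
```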
A linear regression model with two input parameters is, geometrically, a regression surface. A plot of g(x, s) with 
the twelve data points is shown below. 
Normal Equations


As presented in section 6.4, the calculation of a best approximation involves projecting onto the column space of a 
coefficient matrix, requiring an orthogonal basis for the column space. While (Gram-Schmidt) orthogonalization can 
be applied to find such a basis, the process is computationally intensive and, more detrimental to the results, error 
prone. In practice, the normal equations 

\[
M^T M v = M^T b
\]
are solved instead. It is known that v is a solution of the normal equations if and only if v is a best approximation to 
a solution of Mv = b. 


Crumpet 31: Normal Equations 


Letting W be the column space of M, an m × n matrix, the following statements are equivalent.

1. $\hat{v}$ is a best approximation to a solution of $Mv = b$.

2. $M\hat{v}$ is the closest point to b in the column space of M.

3. $M\hat{v} = \operatorname{proj}_W b$.

4. $b - M\hat{v}$ is in $W^\perp$.

5. $(b - M\hat{v})^T M_{\cdot,j} = 0$ for all $j = 1, 2, \ldots, n$.

6. $M^T M \hat{v} = M^T b$.

1 ⇔ 2 by definition of best approximation. 2 ⇔ 3 by theorem 19. 3 ⇒ 4 by the fact that $(b - \operatorname{proj}_W b)$ is in $W^\perp$. 4 ⇒ 3 by the facts that (i) $b = M\hat{v} + (b - M\hat{v})$; (ii) $M\hat{v}$ is in W; (iii) $b - M\hat{v}$ is in $W^\perp$; and (iv) corollary 17. 4 ⇔ 5 by definition of $W^\perp$ and the fact that each column of M is in W. 5 ⇔ 6 by matrix algebra: for each $j = 1, 2, \ldots, n$,
\[
(b - M\hat{v})^T M_{\cdot,j} = 0 \iff M_{\cdot,j}^T (b - M\hat{v}) = 0 \iff M_{\cdot,j}^T b - M_{\cdot,j}^T M\hat{v} = 0 \iff M_{\cdot,j}^T b = M_{\cdot,j}^T M\hat{v},
\]
this last equality being true for all j if and only if $M^T M \hat{v} = M^T b$.

Because the set of best approximations of $Mv = b$ equals precisely the solution set of $M^T M \hat{v} = M^T b$, the linear system $Mv = b$ has a unique best approximation for each b in $\mathbb{R}^m$ if and only if $M^T M \hat{v} = M^T b$ has a unique solution for each b in $\mathbb{R}^m$. By theorem 7, $M^T M \hat{v} = M^T b$ has a unique solution for each b in $\mathbb{R}^m$ if and only if $M^T M \hat{v} = 0$ has only the trivial solution if and only if $M^T M$ is invertible. Hence the linear system $Mv = b$ has a unique best approximation for each b in $\mathbb{R}^m$ if and only if $M^T M$ is invertible.
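Statement 5 of the crumpet is easy to test numerically. As a sketch, the code below takes the solution reported in (7.1.2) for the price model p(t) = p₀ + rt and checks that the residual b − Mv̂ is orthogonal to both columns of M, up to the rounding in the published digits. The day counts not printed in the text are computed from the fill-up dates, and the variable names are ours.

```python
# Days since 1 February 2012 and price per gallon for the twelve fill-ups.
t = [7, 22, 382, 731, 746, 756, 1101, 1117, 1472, 1842, 2193, 2211]
p = [4.36, 4.40, 4.36, 4.38, 4.60, 4.46, 3.25, 3.00, 2.30, 2.68, 3.10, 3.15]

# Best approximation as reported in (7.1.2).
p0, r = 4.55591498045673, -0.000845069933663

# Residual vector b - M v, one entry per observation.
residual = [pi - (p0 + r * ti) for ti, pi in zip(t, p)]

# Dot products of the residual with the two columns of M.
dot_with_ones = sum(residual)
dot_with_t = sum(ri * ti for ri, ti in zip(residual, t))

print(dot_with_ones, dot_with_t)
```

Both dot products come out as tiny numbers, zero for practical purposes, which is what statements 4 and 5 predict.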


Solving the normal equations amounts to solving a linear system of p equations in p variables where p is the 
number of parameters (not the number of data points). The normal equations represent a relatively small system with 
known, dependable solution techniques. 

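For example, the model p(t) = p₀ + rt, which came from a best approximation with

\[
M = \begin{bmatrix} 1 & 7 \\ 1 & 22 \\ 1 & 382 \\ \vdots & \vdots \\ 1 & 2211 \end{bmatrix}, \quad
v = \begin{bmatrix} p_0 \\ r \end{bmatrix}, \quad
b = \begin{bmatrix} 4.36 \\ 4.40 \\ 4.36 \\ \vdots \\ 3.15 \end{bmatrix},
\]

can be solved by first computing

\[
M^T M = \begin{bmatrix} 12 & 12580 \\ 12580 & 19526278 \end{bmatrix}
\quad \text{and} \quad
M^T b = \begin{bmatrix} 44.04 \\ 40812.34 \end{bmatrix}
\]

and then solving

\[
\begin{bmatrix} 12 & 12580 \\ 12580 & 19526278 \end{bmatrix}
\begin{bmatrix} p_0 \\ r \end{bmatrix} =
\begin{bmatrix} 44.04 \\ 40812.34 \end{bmatrix}.
\]

Can you provide this solution? Answer on page 236.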


Key Concepts 
least squares solution: a best approximation $\hat{v}$ of $Mv = b$, having sum of squared errors $\|M\hat{v} - b\|^2$.

sum of squared errors: given observations $(X_i, y_i)$, $i = 1, 2, \ldots, N$, and model $y = f(X)$, $(f(X_1) - y_1)^2 + (f(X_2) - y_2)^2 + \cdots + (f(X_N) - y_N)^2$.

linear regression: given a model of the form $f(X) = \beta_1 f_1(X) + \beta_2 f_2(X) + \cdots + \beta_p f_p(X)$ and observations $(X_i, y_i)$, $i = 1, 2, \ldots, N$, linear regression refers to finding a best approximation of

\[
\begin{bmatrix}
f_1(X_1) & f_2(X_1) & \cdots & f_p(X_1) \\
f_1(X_2) & f_2(X_2) & \cdots & f_p(X_2) \\
\vdots & \vdots & \ddots & \vdots \\
f_1(X_N) & f_2(X_N) & \cdots & f_p(X_N)
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} =
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}.
\]

normal equations: $M^T M v = M^T b$, whose solutions coincide precisely with best approximations of $Mv = b$.

multiple linear regression: linear regression with multiple input variables.

multilinear regression: another name for multiple linear regression.


Exercises

1. Is the function linear or nonlinear in its parameters? A parameter is any variable quantity not listed as an independent variable (input) of the function.

(a) f(x) = a₃x³ + a₂x² + a₁x + a₀
(b) g(x) = (x − r₁)(x − r₂)(x − r₃) [A]-362
(c) h(t) = Poe”
(d) m(t) = k +rin(2zt) [A]-362
(e) Φ_E(Q) = Q/ε₀ [Gauss’s Law of Electric Flux]
(f) G(m₁, m₂, r) = km₁m₂/r² [Gravitational Force] [A]-362
(g) F(m, θ) = μmg cos θ [Frictional Force] where g is a gravitational constant, not a parameter.
(h) Y(A,L,A;,) = in [Modulus of Elasticity] [A]-362

2. Based on the general shape of the graph, propose a model with one independent variable that could be fitted to the data using linear regression.

(a) [scatterplot] [A]-362
(b) [scatterplot]
(c) [scatterplot] [A]-362
(d) [scatterplot]
(e) [scatterplot] [A]-362
(f) [scatterplot]

3. Use normal equations to find the best fit of the model to the data.
(a) [SageMath Cell 106] f(x) = β₃x³ + β₂x² + β₁x + β₀

x    | 389   | 851    | 2.467 | 4.113
f(x) | 141.1 | -35.51 | 167.1 | 18.3
x    | 4525  | 6.639  | 8.873 | 11.24
f(x) | 173   | 243    | 1039  | 1783
[S]-334

(b) [SageMath Cell 107] g(x) = β₀ + β₁x

x    | 94    | 3.002 | 4.837 | 7.422
g(x) | 6.341 | 19.43 | 36.54 | 53.86
x    | 8.038 | 10.06 | 13.06 | 13.89
g(x) | 47.57 | 61.6  | 86.13 | 83.31
[A]-362

(c) [SageMath Cell 108] h(x) = β₀ + β₁ sin(x) + β₂ cos(x)

x    | 17    | 275   | 525   | 3.185
h(x) | 10.79 | 10    | 8.533
x    | 3.545 | 4.618 | 6.604 | 6.679
h(x) | −.437 | 7.455 | 10.02 | 9.357

(d) [SageMath Cell 109] J = Bo nGt)

t    | 042    | 1.778 | 1.934 | 3.431
j(t) | -13.22 | 10.7  | 11.18 | 14.8
t    | 3.888  | 5.491 | 6.57  | 8.98
j(t) | 15.52  | 17.39 | 18.7  | 20.67

(e) [SageMath Cell 110] k(t, x) = β₀x² + β₁xt + β₂t²

k(t, x)  | x = 041 | 306  | 1.07 | 1.92
t = .626 | -1.36   | 8.17 | 6.59 | 24.1
1.01     | 8.47    | 8.88 | 16.9 | 38
1.53     | 18.9    | 19   | 29   | 46.1
1.63     | 21.4    | 28.9 | 36   | 50
[S]-335

(f) [SageMath Cell 111] Bo Meay= l+e

k(t, x) | x = 53 | 1.27 | 1.95 | 2.15
t = 14  | 5.31   | 10.5 | 17.6 | 37.9
8       | Al     | 6.39 | 21.4 | 44.1
1.51    | 11.4   | 22.6 | 39.4 | 60.1
2.24    | 40.4   | 39.4 | 56.6 | 80.8
[A]-362

(g) [SageMath Cell 112] m(t, x) = Bo + wis

m(t, x) | x = 8 | 155  | 1.82 | 2.02
t = .64 | 43    | 841  | 26.4 | 35.8
83      | 7.04  | 9.67 | 25.3 | 43.5
1.49    | 16.7  | 19.2 | 38.8 | 58.5
1.85    | 26.6  | 31.9 | 50.9 | 69

(h) [SageMath Cell 113] n(x₁, x₂) = β₀ sin(x₁x₂) + β₁ cos(x₁ + x₂)

n(x₁, x₂) | x₂ = 29 | 17   | 1.95 | 2.38
x₁ = -76  | 1.43    | 8.5  | 27   | 41.3
1.22      | 9.46    | 18.3 | 33.6 | 49
2.05      | 31.8    | 42.2 | 48.6 | 67.5
2.1       | 29.3    | 42.6 | 57.3 | 69.6

4. Redo questions 3abdef using orthogonal projection instead. [S]-336 [A]-362


5. Calculate the sum of the squared errors for the model of 
question 3. [S]-338 [A]-362 


6. Fitting an exponential function. Many physical quantities are related exponentially, making y(t) = ae^{kt} a common model in science. Unfortunately this model is not linear in its parameters, making linear regression impossible directly. By taking the natural log of both sides of the formula, however, the model becomes

ln(y(t)) = ln(a) + kt

which is a line in the variables t and ln y. Find a best-fit exponential model for the data by following the outlined steps. [A]-362

t    | .6203 | 1.062 | 1.625 | 2.158
y(t) | 25.90 | 21.77 | 18.38 | 14.64
t    | 3.147 | 8.259 | 8.931 | 9.519
y(t) | 9.905 | 2.818 | 3.022 | 2.110

(a) Complete the following chart, filling in the logarithms of the given y values.

t    | .6203 | 1.062 | 1.625 | 2.158
ln y | 3.254 | 3.081 | 2.911 |
t    | 3.147 | 8.259 | 8.931 | 9.519
ln y |       |       |       |

(b) Fit a linear model, f(t) = β₀ + β₁t, to the data in the chart of part (a).

(c) β₀ = ln(a) and β₁ = k so the model is y(t) = e^{β₀}e^{β₁t}. Calculate a = e^{β₀}. You now have the parameters a and k for the model.

(d) Plot the model superimposed upon a scatterplot of the data to see the fit.


7. Eyeball challenge. 


(a) Draw a line on the graph that fits the data well. 
(b) Find an equation for the line you have drawn. 


(c) Calculate the sum of the squared errors of your 
model (linear equation). 


(d) Calculate the linear regression model (of the form 
f(x) = Bo + 81x) for the data. 


(e) Calculate the sum of the squared errors for the lin- 
ear regression model. Compare it to your answer 
for 7c. How did you do with your “eyeballed” line? 
The coordinates of the points are (4.13, 2.35), (1.02, 1.12), (2.46, 1.37), (.045, .688), (1.09, 1.03), (2.42, 1.17), (1.11, .659), (4.03, 2.3), (2.72, 1.32), (1.44, 1.25), (4.87, 2.18), and (3.07, 1.86).

8. [SageMath Cell 114] Verify the result in (7.1.1) using normal equations.

9. [SageMath Cell 115] Verify the result in (7.1.3) using normal equations.

10. [SageMath Cell 116] Verify the result in (7.1.4) using normal equations.

Download complete data for the diesel mileage of the Sportwagen TDI referred to in this section at the ancillary website, section 7.1, to complete the following exercises. Use data from June (or some other month) instead of data from February.

11. Recompute the model of (7.1.1) using data from the month of your choosing.

12. Recompute the model of (7.1.2) using data from the month of your choosing.

13. Recompute the model of (7.1.3) using data from the month of your choosing.

14. Recompute the model of (7.1.4) using data from the month of your choosing.

15. Create your own question, propose a model to help answer it, and find the parameters of your model using linear regression.


Answers 


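linear system solution: The solution can be reached by row reduction or the inverse method. Since the coefficient matrix is 2 × 2 and it is easy enough to compute a 2 × 2 inverse, that is probably the easiest route:

\[
\begin{bmatrix} 12 & 12580 \\ 12580 & 19526278 \end{bmatrix}^{-1}
= \frac{1}{12(19526278) - 12580^2} \begin{bmatrix} 19526278 & -12580 \\ -12580 & 12 \end{bmatrix}
= \begin{bmatrix} \frac{19526278}{76058936} & -\frac{12580}{76058936} \\ -\frac{12580}{76058936} & \frac{12}{76058936} \end{bmatrix}
\approx \begin{bmatrix} 0.256726 & -1.65398(10)^{-4} \\ -1.65398(10)^{-4} & 1.57772(10)^{-7} \end{bmatrix}
\]

so

\[
v = \begin{bmatrix} p_0 \\ r \end{bmatrix}
\approx \begin{bmatrix} 0.256726 & -1.65398(10)^{-4} \\ -1.65398(10)^{-4} & 1.57772(10)^{-7} \end{bmatrix}
\begin{bmatrix} 44.04 \\ 40812.34 \end{bmatrix}
\approx \begin{bmatrix} 4.556 \\ -8.451(10)^{-4} \end{bmatrix},
\]

the same as result (7.1.2).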
7.2 Markov Chains [3.5]


From https://www.capitalbikeshare.com/about 


Capital Bikeshare is metro DC’s bikeshare system, with more than 4,300 bikes available at 500 stations 
across six jurisdictions: Washington, DC; Arlington, VA; Alexandria, VA; Montgomery County, MD; 
Prince George’s County, MD; Fairfax County, VA; and the City of Falls Church, VA. Capital Bikeshare 
provides residents and visitors with a convenient, fun and affordable transportation option for getting 
from Point A to Point B. 


Capital Bikeshare, like other bikeshare systems, consists of a fleet of specially designed, sturdy and 
durable bikes that are locked into a network of docking stations throughout the region. The bikes can 
be unlocked from any station and returned to any station in the system, making them ideal for one-way 
trips. People use bikeshare to commute to work or school, run errands, get to appointments or social 
engagements and more. 


Capital Bikeshare is available for use 24 hours a day, 7 days a week, 365 days a year. Riders have access 
to a bike at any station across the system. 


Capital Bikeshare makes their trip data available to the general public free of charge. The data includes (i) duration 
of trip, (ii) start date and time, (iii) end date and time, (iv) starting station name and number, (v) ending station name 
and number, and more.¹

The following chart was processed from real Capital Bikeshare data for the year 2018. It shows the total number of 
rides that started and ended in the section of Alexandria containing bike stations 31041 through 31048. The locations 
of the stations are to the right of the chart. The total number of rides accounted for is 10,364.

                                   From
To       31041  31042  31043  31044  31045  31046  31047  31048   Station location
31041      664    163     77     99    103     55    265    256   Prince St & Union St
31042      124    519    152    159    161    101    658    710   Market Square / King St & Royal St
31043       66    206    102     87     58    129    611     18   Saint Asaph St & Pendleton St
31044       56    121     55    156     32     73    114    501   King St & Patrick St
31045       41    128     41     22     70    121     64    187   Commerce St & Fayette St
31046       20     97     98     41     76     96     70     11   Henry St & Pendleton St
31047      172    561    568     88     64    180    182     28   Braddock Rd Metro
31048       78    215     17    197     41     13     33     93   King St Metro South

Total:    1221   2010   1110    849    605    768   1997   1804   10364


From these data, linear algebra can be applied to estimate the distribution of bicycles among the stations. Such 
information could be used to decide which stations to expand or reduce, where another station might be needed, the 
most likely place to find a free bike, and so on. 

The method is one of prediction over time based on percentages. Given the distribution of bikes among the 
stations (as percentages) at some time, it uses the data to predict the distribution of bikes at some fixed amount of 
time later. Assuming the annual data on bicycle movement is reflected monthly, we will use one month for the time 
step. 

Because the method works with percentages, we do not need to know how many bikes are in the neighborhood. 
It could be 25 or 250. No matter. We begin by dividing each column of the chart by the total number of rides in 
that column. The first column is divided by 1221, the second column by 2010, and so on, resulting in a table whose 
columns all sum to 1. Accurate to five decimal places, this normalized chart is collected in the matrix M: 


0.54382 0.08109 0.06937 0.11661 0.17025 0.07161 0.13270 0.14191 
0.10156 0.25821 0.13694 0.18728 0.26612 0.13151 0.32949 0.39357 
0.05405 0.10249 0.09189 0.10247 0.09587 0.16797 0.30596 0.00998 
0.04586 0.06020 0.04955 0.18375 0.05289 0.09505 0.05709 0.27772 
0.03358 0.06368 0.03694 0.02591 0.11570 0.15755 0.03205 0.10366 
0.01638 0.04826 0.08829 0.04829 0.12562 0.12500 0.03505 0.00610 
0.14087 0.27910 0.51171 0.10365 0.10579 0.23438 0.09114 0.01552 
0.06388 0.10697 0.01532 0.23204 0.06777 0.01693 0.01652 0.05155 
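The normalization is mechanical enough to sketch in a few lines. The layout below (rows are destination stations, columns are starting stations) follows the chart; the variable names are ours.

```python
# Ride counts: rows are "To" stations 31041..31048, columns are "From" stations.
counts = [
    [664, 163,  77,  99, 103,  55, 265, 256],
    [124, 519, 152, 159, 161, 101, 658, 710],
    [ 66, 206, 102,  87,  58, 129, 611,  18],
    [ 56, 121,  55, 156,  32,  73, 114, 501],
    [ 41, 128,  41,  22,  70, 121,  64, 187],
    [ 20,  97,  98,  41,  76,  96,  70,  11],
    [172, 561, 568,  88,  64, 180, 182,  28],
    [ 78, 215,  17, 197,  41,  13,  33,  93],
]
n = len(counts)

# Total rides leaving each station (one total per column).
col_totals = [sum(row[j] for row in counts) for j in range(n)]

# Divide every column by its total so each column sums to 1.
M = [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]

for row in M:
    print(" ".join(f"{entry:.5f}" for entry in row))
```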


¹See https://www.capitalbikeshare.com/system-data.

M thereby represents the percentage of rides starting at the station represented by the column that end at the station represented by the row one month later. For example, the 0.06937 in the first row, third column means that 6.937% of the bikes at station number 31043 will be at station number 31041 in a month. Empirically speaking, about 7% of the bikes at station number 31043 are destined for station number 31041 over the course of the month. We know this, in fact, was the case over the whole of 2018, and we are assuming it makes a good estimate of the monthly migration of bikes within the neighborhood.

The product M₂,₁M₁,₅ then represents the percentage of bikes at the station of column 5 (31045) that head for the station of column 1 (31041) first and then to the station of column 2 (31042). In other words, it is the percentage of bikes at station 31045 that can be expected to be at station 31042 two months later by way of station 31041. Similarly, M₂,₂M₂,₅ represents the percentage of bikes at station 31045 that are destined for station 31042 via station 31042 (the second ride is from station 31042 back to station 31042) over the course of two months; M₂,₃M₃,₅ represents the percentage of bikes at station 31045 that are destined for station 31042 after being dropped at station 31043; and so on. The sum M₂,₁M₁,₅ + M₂,₂M₂,₅ + ··· + M₂,₈M₈,₅ therefore represents the total percentage of the bikes at station 31045 that can be expected to be at station 31042 after two months. Notice that sum is just a row-column product (row 2 times column 5), which is the 2,5-entry of M².
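As a sketch of the two-month bookkeeping, the code below rebuilds M from the ride counts and adds up the eight route-by-route contributions to the 2,5-entry of M². Python lists are indexed from 0, so row 2 and column 5 become indices 1 and 4; the names `via` and `two_month` are ours.

```python
# Ride counts: rows are "To" stations 31041..31048, columns are "From" stations.
counts = [
    [664, 163,  77,  99, 103,  55, 265, 256],
    [124, 519, 152, 159, 161, 101, 658, 710],
    [ 66, 206, 102,  87,  58, 129, 611,  18],
    [ 56, 121,  55, 156,  32,  73, 114, 501],
    [ 41, 128,  41,  22,  70, 121,  64, 187],
    [ 20,  97,  98,  41,  76,  96,  70,  11],
    [172, 561, 568,  88,  64, 180, 182,  28],
    [ 78, 215,  17, 197,  41,  13,  33,  93],
]
n = len(counts)
col_totals = [sum(row[j] for row in counts) for j in range(n)]
M = [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]

# via[k] is the share of bikes from station 31045 that reach station 31042
# in two months with an intermediate stop at the station of index k.
via = [M[1][k] * M[k][4] for k in range(n)]
two_month = sum(via)   # row 2 of M times column 5 of M

for k, share in enumerate(via):
    print(f"via 3104{k + 1}: {share:.5f}")
print(f"total: {two_month:.5f}")
```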


Generalizing, the i,j-entry of M² =


0.34772 0.14568 0.14367 0.16169 0.17933 0.14162 0.14917 0.16962 
0.18009 0.25757 0.26034 0.24737 0.21787 0.22429 0.20081 0.22320 
0.09918 0.14711 0.20639 0.09837 0.11594 0.15050 0.11192 0.09361 
0.07128 0.08900 0.06889 0.13178 0.08122 0.07528 0.06639 0.10299 
0.04552 0.05952 0.05190 0.06237 0.07117 0.06664 0.05208 0.05619 
0.03239 0.05021 0.05196 0.04321 0.06101 0.07067 0.05955 0.05025 
0.15859 0.18732 0.17162 0.16729 0.20517 0.21018 0.29330 0.17834 
0.06525 0.06360 0.04523 0.08794 0.06829 0.06081 0.06678 0.12580 


holds the percentage of bikes starting at station 3104j that end up at station 3104i after two months. Likewise, the i,j-entry of M^k holds the percentage of bikes starting at station 3104j that end up at station 3104i after k months. Multiplying M² by itself, M⁴ =


0.22039 0.18022 0.17861 0.18487 0.18748 0.17934 0.18079 0.18673 
0.21605 0.22893 0.23122 0.22754 0.22432 0.22723 0.22265 0.22529 
0.12247 0.13283 0.13920 0.12638 0.12854 0.13309 0.12804 0.12492 
0.08042 0.08277 0.08089 0.08616 0.08189 0.08129 0.08003 0.08512 
0.05346 0.05606 0.05568 0.05638 0.05587 0.05612 0.05539 0.05599 
0.04633 0.05067 0.05076 0.04970 0.05058 0.05155 0.05181 0.04994 
0.19212 0.20053 0.19847 0.19753 0.20252 0.20392 0.21272 0.19884 
0.06877 0.06799 0.06518 0.07144 0.06880 0.06746 0.06857 0.07318 


and then M⁴ by itself, M⁸ =


0.19015 0.18855 0.18844 0.18879 0.18885 0.18851 0.18857 0.18887 
0.22448 0.22495 0.22501 0.22487 0.22483 0.22494 0.22486 0.22484 
0.12913 0.12955 0.12965 0.12942 0.12944 0.12956 0.12947 0.12938 
0.08181 0.08186 0.08185 0.08189 0.08185 0.08185 0.08183 0.08189 
0.05532 0.05542 0.05542 0.05541 0.05540 0.05542 0.05541 0.05541 
0.04985 0.05004 0.05005 0.05001 0.05001 0.05005 0.05005 0.05000 
0.20067 0.20111 0.20109 0.20102 0.20108 0.20116 0.20127 0.20103 
0.06858 0.06853 0.06849 0.06858 0.06855 0.06852 0.06854 0.06859 
and so on,

M¹⁶ =

0.18890 0.18890 0.18890 0.18890 0.18890 0.18890 0.18890 0.18890
0.22483 0.22483 0.22483 0.22483 0.22483 0.22483 0.22483 0.22483
0.12944 0.12944 0.12944 0.12944 0.12944 0.12944 0.12944 0.12944
0.08185 0.08185 0.08185 0.08185 0.08185 0.08185 0.08185 0.08185
0.05540 0.05540 0.05540 0.05540 0.05540 0.05540 0.05540 0.05540
0.05000 0.05000 0.05000 0.05000 0.05000 0.05000 0.05000 0.05000
0.20104 0.20104 0.20104 0.20104 0.20104 0.20104 0.20104 0.20104
0.06855 0.06855 0.06855 0.06855 0.06855 0.06855 0.06855 0.06855

and

M³² =

0.18890 0.18890 0.18890 0.18890 0.18890 0.18890 0.18890 0.18890
0.22483 0.22483 0.22483 0.22483 0.22483 0.22483 0.22483 0.22483
0.12944 0.12944 0.12944 0.12944 0.12944 0.12944 0.12944 0.12944
0.08185 0.08185 0.08185 0.08185 0.08185 0.08185 0.08185 0.08185
0.05540 0.05540 0.05540 0.05540 0.05540 0.05540 0.05540 0.05540
0.05000 0.05000 0.05000 0.05000 0.05000 0.05000 0.05000 0.05000
0.20104 0.20104 0.20104 0.20104 0.20104 0.20104 0.20104 0.20104
0.06855 0.06855 0.06855 0.06855 0.06855 0.06855 0.06855 0.06855


Notice that (i) accurate to five decimal places, M¹⁶ = M³², and (ii) the columns of M¹⁶ are all the same! Higher powers of M will be no different.
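Repeated squaring is simple to carry out by machine. The following sketch (the helper `matmul` and the name `spread` are ours) squares M four times to reach M¹⁶, squares once more for M³², and measures how far apart the columns of M¹⁶ are.

```python
# Ride counts: rows are "To" stations 31041..31048, columns are "From" stations.
counts = [
    [664, 163,  77,  99, 103,  55, 265, 256],
    [124, 519, 152, 159, 161, 101, 658, 710],
    [ 66, 206, 102,  87,  58, 129, 611,  18],
    [ 56, 121,  55, 156,  32,  73, 114, 501],
    [ 41, 128,  41,  22,  70, 121,  64, 187],
    [ 20,  97,  98,  41,  76,  96,  70,  11],
    [172, 561, 568,  88,  64, 180, 182,  28],
    [ 78, 215,  17, 197,  41,  13,  33,  93],
]
n = len(counts)
col_totals = [sum(row[j] for row in counts) for j in range(n)]
M = [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

P = M
for _ in range(4):          # M -> M^2 -> M^4 -> M^8 -> M^16
    P = matmul(P, P)
M16 = P
M32 = matmul(M16, M16)

# Largest difference between any two entries in the same row of M^16.
spread = max(max(row) - min(row) for row in M16)
# Largest entrywise difference between M^16 and M^32.
change = max(abs(M16[i][j] - M32[i][j]) for i in range(n) for j in range(n))

print([round(row[0], 5) for row in M16])
print(spread, change)
```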

This means that after 16 months, about 18.89% of bikes from station 31041 will end up at station 31041. About 18.89% of bikes from station 31042 will end up at station 31041. About 18.89% of bikes from station 31043 will end up at station 31041. In fact, about 18.89% of bikes from each station will end up at station 31041. Altogether, then, about 18.89% of all the bikes in the neighborhood will end up at station 31041. Likewise, about 22.48% of all the bikes in the neighborhood will end up at station 31042, 12.94% at station 31043, and so on. No matter how the bikes are initially distributed, they will end up distributed this way after some time, and stay that way (so long as the empirical migration percentages hold).


Crumpet 32: Why it Works 


Suppose M is a positive column-stochastic matrix. The Gershgorin circle theorem ensures that $M^T$ (and therefore M) has no eigenvalue with magnitude greater than one and its only possible eigenvalue with magnitude equal to one is 1. Letting $\mathbf{1}$ be the column vector whose entries are all 1, note that $M^T\mathbf{1} = \mathbf{1}$ (this is equivalent to saying the rows of $M^T$ sum to 1) so 1 is indeed an eigenvalue of $M^T$. Hence M has dominant eigenvalue 1. Since $M^T - I$ is real, its null space admits a basis of real vectors. Suppose $w \neq k\mathbf{1}$ is a real nonzero vector in the null space of $M^T - I$, and assume w has at least one positive entry (if it does not, multiply it by −1). Now set

\[
\alpha_{\max} = \max\{\alpha : \mathbf{1} - \alpha w \text{ is nonnegative}\}
\]

(the set is nonempty since it contains 0 and closed since the limit of a nonnegative sequence is nonnegative, so it has a maximum). Now $u = \mathbf{1} - \alpha_{\max} w$ has at least one zero entry (if it does not, then $\alpha_{\max}$ is not maximal), and u is nonzero because w is not a multiple of $\mathbf{1}$. But u is then a nonnegative eigenvector (of the positive matrix $M^T$) and hence must be positive, contradicting that it has at least one zero entry. Thus no such w exists, and the eigenspace of 1 is one-dimensional. A dominant eigenvalue with a one-dimensional eigenspace is exactly what is needed for the power method to work (section 6.2). In this case, the matrix M is stochastic, so instead of computing $v_k = M^k v_0$, we simply calculate $M^k$. The entries of $M^k$ will never tend to 0 or infinity since powers of stochastic matrices are stochastic, so we do not have to worry about scaling after each iteration. We can then multiply any nonzero $v_0$ by $M^k$ to find the approximation $v_k$ of the dominant eigenvector.


Vocabulary 


• A positive matrix is one whose entries are all positive.

• A nonnegative matrix is one whose entries are all nonnegative.

• A stochastic matrix is a nonnegative matrix whose columns each sum to 1.
Lemmas

In each of the following lemmas, M is an n × n matrix.

• If M is positive and w is a nonnegative eigenvector, then w is positive.

Proof. Since eigenvectors are nonzero and w is nonnegative, w has a positive entry, say its $i^{\text{th}}$. It follows that

\[
(Mw)_{j,1} = \sum_{k=1}^{n} M_{j,k} w_{k,1} \geq M_{j,i} w_{i,1} > 0
\]

for each $j = 1, 2, \ldots, n$. Since Mw is a scalar multiple of w, every entry of w is positive as well. □

• The eigenvalues of M and $M^T$ are the same.

Proof. Since the determinants of a matrix and its transpose are equal (section 3.5), for any scalar λ,

\[
\det(M - \lambda I) = \det\left((M - \lambda I)^T\right) = \det\left(M^T - (\lambda I)^T\right) = \det(M^T - \lambda I).
\]

Hence M and $M^T$ have the same characteristic equation and therefore the same eigenvalues. □

• If the entries of M are real numbers and λ is a real number, then the null space of M − λI admits a basis of vectors with real entries.

Proof. Since the null space of any matrix can be found through row reduction, and row reducing a real matrix does not require complex numbers, a real basis of the null space of M − λI (which has real entries) exists. □

• Gershgorin circle theorem: Every (possibly complex) eigenvalue of M lies in at least one disk with center $M_{i,i}$ and radius $r_i = \sum_{j \neq i} |M_{i,j}|$, $i = 1, 2, \ldots, n$.

Proof. Suppose λ, x is an eigenpair of M and let $x_{i,1}$ be the entry of x with greatest magnitude. Because Mx = λx, we have $\lambda x_{i,1} = \sum_{j=1}^{n} M_{i,j} x_{j,1}$. Hence

\[
(\lambda - M_{i,i}) x_{i,1} = \lambda x_{i,1} - M_{i,i} x_{i,1} = \sum_{j \neq i} M_{i,j} x_{j,1},
\]

from which it follows

\[
|\lambda - M_{i,i}| = \frac{1}{|x_{i,1}|} \left| \sum_{j \neq i} M_{i,j} x_{j,1} \right|
\leq \sum_{j \neq i} |M_{i,j}| \left| \frac{x_{j,1}}{x_{i,1}} \right|
\leq \sum_{j \neq i} |M_{i,j}|. \qquad \square
\]

• Powers of stochastic matrices are stochastic.

Proof. Let S be a stochastic matrix. By definition, $\mathbf{1}^T S = \mathbf{1}^T$ (the sum of each column of S is one). By induction, assume $S^k$ is stochastic for some $k \geq 1$. Then

\[
\mathbf{1}^T S^{k+1} = (\mathbf{1}^T S^k) S = \mathbf{1}^T S = \mathbf{1}^T
\]

so the columns of $S^{k+1}$ sum to one. Of course powers of nonnegative matrices are nonnegative, so $S^{k+1}$ is stochastic. □
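The way the Gershgorin circle theorem is used in Crumpet 32 can be seen concretely for the bikeshare matrix. In the sketch below (names are ours), each disk for Mᵀ has its center on a diagonal entry of M and a radius equal to the rest of the corresponding column of M, so center plus radius is exactly 1. Every disk therefore sits inside the unit disk and touches the unit circle only at the point 1.

```python
# Ride counts: rows are "To" stations 31041..31048, columns are "From" stations.
counts = [
    [664, 163,  77,  99, 103,  55, 265, 256],
    [124, 519, 152, 159, 161, 101, 658, 710],
    [ 66, 206, 102,  87,  58, 129, 611,  18],
    [ 56, 121,  55, 156,  32,  73, 114, 501],
    [ 41, 128,  41,  22,  70, 121,  64, 187],
    [ 20,  97,  98,  41,  76,  96,  70,  11],
    [172, 561, 568,  88,  64, 180, 182,  28],
    [ 78, 215,  17, 197,  41,  13,  33,  93],
]
n = len(counts)
col_totals = [sum(row[j] for row in counts) for j in range(n)]
M = [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]

MT = [list(col) for col in zip(*M)]   # transpose of M; its rows sum to 1

# One (center, radius) pair per row of M^T.
discs = [(MT[i][i], sum(abs(MT[i][j]) for j in range(n) if j != i))
         for i in range(n)]

for center, radius in discs:
    print(f"center {center:.5f}  radius {radius:.5f}  sum {center + radius:.5f}")
```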


For example, suppose Capital Bikeshare supplies each station with the same number of bicycles to begin. That information can be recorded in a column vector with length eight and each entry equal to 1/8. After one month, we assume that the empirical data on transitions from one station to another is reasonably accurate, so the distribution of bicycles will be approximately

\[
M \begin{bmatrix} 1/8 \\ \vdots \\ 1/8 \end{bmatrix} \approx
\begin{bmatrix} 0.166 \\ 0.226 \\ 0.116 \\ 0.103 \\ 0.071 \\ 0.062 \\ 0.185 \\ 0.071 \end{bmatrix}.
\]

Multiplying the distribution vector by the transition matrix M gives the new distribution of bicycles among the stations. After another month, the distribution of bicycles will be approximately

\[
M \begin{bmatrix} 0.166 \\ 0.226 \\ 0.116 \\ 0.103 \\ 0.071 \\ 0.062 \\ 0.185 \\ 0.071 \end{bmatrix}
= M^2 \begin{bmatrix} 1/8 \\ \vdots \\ 1/8 \end{bmatrix} \approx
\begin{bmatrix} 0.180 \\ 0.226 \\ 0.128 \\ 0.086 \\ 0.058 \\ 0.052 \\ 0.196 \\ 0.073 \end{bmatrix}
\]

and after four months,

\[
M^2 \begin{bmatrix} 0.180 \\ 0.226 \\ 0.128 \\ 0.086 \\ 0.058 \\ 0.052 \\ 0.196 \\ 0.073 \end{bmatrix}
= M^4 \begin{bmatrix} 1/8 \\ \vdots \\ 1/8 \end{bmatrix} \approx
\begin{bmatrix} 0.187 \\ 0.225 \\ 0.129 \\ 0.082 \\ 0.056 \\ 0.050 \\ 0.201 \\ 0.069 \end{bmatrix}
\]

and after eight months,

\[
M^4 \begin{bmatrix} 0.187 \\ 0.225 \\ 0.129 \\ 0.082 \\ 0.056 \\ 0.050 \\ 0.201 \\ 0.069 \end{bmatrix}
= M^8 \begin{bmatrix} 1/8 \\ \vdots \\ 1/8 \end{bmatrix} \approx
\begin{bmatrix} 0.189 \\ 0.225 \\ 0.129 \\ 0.082 \\ 0.055 \\ 0.050 \\ 0.201 \\ 0.069 \end{bmatrix}
\]

which is, accurate to three decimal places, equal to the columns of M¹⁶ (and M³² for that matter). The distribution of bicycles does not change much after the first two months.


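More importantly, the distribution we see after 8 months will be the eventual distribution of bicycles no matter 
the initial distribution! Given any initial distribution of bicycles, 

$$\begin{bmatrix} w_1 & w_2 & w_3 & w_4 & w_5 & w_6 & w_7 & w_8 \end{bmatrix}^T$$

(where $\sum_{i=1}^{8} w_i = 1$ and each $w_i$ is nonnegative), 

$$M^{16} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \\ w_7 \\ w_8 \end{bmatrix} \approx \begin{bmatrix} 0.18890 \sum_{i=1}^{8} w_i \\ 0.22483 \sum_{i=1}^{8} w_i \\ 0.12944 \sum_{i=1}^{8} w_i \\ 0.08185 \sum_{i=1}^{8} w_i \\ 0.05540 \sum_{i=1}^{8} w_i \\ 0.05000 \sum_{i=1}^{8} w_i \\ 0.20104 \sum_{i=1}^{8} w_i \\ 0.06855 \sum_{i=1}^{8} w_i \end{bmatrix} = \left( \sum_{i=1}^{8} w_i \right) \begin{bmatrix} 0.18890 \\ 0.22483 \\ 0.12944 \\ 0.08185 \\ 0.05540 \\ 0.05000 \\ 0.20104 \\ 0.06855 \end{bmatrix} = \begin{bmatrix} 0.18890 \\ 0.22483 \\ 0.12944 \\ 0.08185 \\ 0.05540 \\ 0.05000 \\ 0.20104 \\ 0.06855 \end{bmatrix}.$$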


This means the distribution of bicycles in the long run is given by this vector, $v$, the eigenvector of $M$ 
corresponding to eigenvalue 1. This distribution is called the steady-state distribution because, if reached, it never 
deviates: $Mv = v$, so $v$ remains steady over time. We would expect to see the bicycles distributed among the stations 
in these proportions after adequate time. 


Formalities 


The sequence of bicycle distributions in the Capital Bikeshare scenario is an example of a discrete-time Markov 
chain. Any Markov chain necessarily consists of a set of states (stations in our example), a set of probabilities that 
some object (bicycle in our example) will transition, in one time step, from one state to any other, depending only on 
its current state, and an initial distribution among the possible states. The matrix $M$, where $M_{i,j}$ is the probability 
of transitioning from state $j$ to state $i$, is the transition matrix, and each distribution $v_n$ other than $v_0$ satisfies 
$v_n = M v_{n-1}$. 
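The recurrence is short enough to run by hand or by machine. Below is a minimal sketch in plain Python, assuming a hypothetical two-state chain (the matrix, the starting distribution, and the helper name `step` are made up for illustration; they are not the Capital Bikeshare data).

```python
def step(M, v):
    """One time step of the chain: the matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Hypothetical transition matrix: column j holds the probabilities of moving
# from state j to each state i, so every column sums to 1.
M = [[0.6, 0.2],
     [0.4, 0.8]]

v = [1.0, 0.0]            # v_0: the object starts in state 1 with certainty
history = [v]
for n in range(1, 51):    # v_n = M v_{n-1}
    v = step(M, v)
    history.append(v)

print(history[1])    # distribution after one step
print(history[50])   # very nearly the steady state [1/3, 2/3]
```

For this particular matrix, solving $Mv = v$ by hand gives $v = [1/3, 2/3]^T$, and the iterates settle onto that vector quickly, just as the bicycle distributions did.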

Other situations that can be modeled by Markov chains include 


1. board games whose movement is determined by the roll of a die, such as Snakes and Ladders (the spaces on the 
board are the states, the game piece is the object, and the roll of the die provides the transition probabilities); 


2. sentence construction, as used by computer auto-completion (the states are the words of a specific language, 
the object is the reader’s focus, and the probability of one word following another in a sentence provides the 
transition probabilities); 


3. a closed economy, one where a set of commodities is produced and consumed by the same group (the states are 
the sectors of the economy, the commodities are the objects, and transitioning is interpreted as consumption); 


4. the weather (the states are weather conditions such as sunny, cloudy, and rainy, the object is the weather, and 
the transition probabilities are the conditional probabilities that one weather condition will follow another on, 
say, the next day); 


5. gambling (the states are the amounts of money the gambler could have, the object is the gambler, and the 
likelihoods of winning or losing certain amounts of money provide the transition probabilities); 


6. arrival or service times in a single server queue (the states are the possible sizes of the queue, the object is the 
queue, and the likelihoods of increasing or decreasing the size of the queue by certain amounts provide the 
transition probabilities). 


Key Concepts 


transition matrix a square matrix $M$ where $M_{i,j}$ is the probability of transitioning from state $j$ to state $i$ in one time 
step. The entries of $M$ are nonnegative and each column of $M$ sums to 1.² 


Markov chain a sequence of distributions arising from an initial distribution $v_0$ and the recurrence $v_n = M v_{n-1}$, 
$n > 0$, for some transition matrix $M$.³ 


² A square matrix whose entries are nonnegative and whose columns each sum to 1 is also called a stochastic matrix (whether it models state 
transition probabilities or not). 
³ In a more general setting, the transition matrix may change with time, and would then be replaced by $M_n$. 


state one of the possible conditions of the object associated with a Markov chain. 


steady-state distribution a distribution $v$ such that $Mv = v$ (by definition an eigenvector corresponding to eigenvalue 1). 


properties of a transition matrix If $M$ is a transition matrix, 


1. each column of $M$ sums to one; 
2. each entry of $M$ is nonnegative; 
3. 1 is one of its eigenvalues; 
4. none of its eigenvalues has magnitude greater than one. 


If, additionally, all the entries of some power, $M^k$, of $M$ are positive, 


1. 1 is a dominant eigenvalue; 


2. the eigenspace of 1 is one-dimensional; 


3. each column of $M^k$ approaches the same eigenvector as $k$ grows: the steady-state vector, which corresponds 
to the eigenvalue 1. 


Exercises 


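1. Is $M$ a transition matrix? 


(a) $M = \begin{bmatrix} 0.492 & 0.118 & 0.516 \\ 0.346 & 0.819 & 0.361 \\ 0.472 & 0.063 & 0.123 \end{bmatrix}$ 

(b) $M = \begin{bmatrix} 0.09 & 0.686 & 0.168 \\ 0.908 & 0.036 & 0.807 \\ 0.002 & 0.278 & 0.485 \end{bmatrix}$ [S]-339 

(c) $M = \begin{bmatrix} 0.409 & 0.696 \\ 0.179 & 0.156 \\ 0.412 & 0.148 \end{bmatrix}$ 

(d) $M = \begin{bmatrix} 0.826 & 0.006 & 0.235 \\ 0.104 & 0.122 & 0.609 \\ 0.07 & -0.72 & 0.156 \end{bmatrix}$ [A]-362 

(e) $M = \begin{bmatrix} 0.485 & 0.145 & -0.58 \\ 0.302 & 0.46 & -0.22 \\ 0.213 & 0.395 & -0.20 \end{bmatrix}$ 

(f) $M = \begin{bmatrix} 0.341 & 0.104 & 0.217 \\ 0.525 & 0.592 & 0.249 \\ 0.134 & 0.304 & 0.534 \end{bmatrix}$ [A]-362 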


2. Given transition matrix $M$ and initial distribution vector $v_0$, (i) calculate $M^2$ and $M^3$; (ii) determine the 
probability of transitioning from state 1 to state 2 over the course of one time step, two time steps, and three time 
steps; (iii) determine the probability of being in state 2 after three time steps given equal likelihood of starting in 
any state; (iv) calculate $v_3$; and (v) for the given initial distribution, $v_0$, what is the probability of being in state 2 
after three time steps? 


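(a) $M = \begin{bmatrix} 0.983 & 0.139 \\ 0.017 & 0.861 \end{bmatrix}$; $v_0 = \begin{bmatrix} 0.484 \\ 0.516 \end{bmatrix}$ 

(b) $M = \begin{bmatrix} 0.053 & 0.846 \\ 0.947 & 0.154 \end{bmatrix}$; $v_0 = \begin{bmatrix} 0.653 \\ 0.347 \end{bmatrix}$ [A]-362 

(c) $M = \begin{bmatrix} 0.484 & 0.653 \\ 0.516 & 0.347 \end{bmatrix}$; $v_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ 

(d) $M = \begin{bmatrix} 0.403 & 0.357 & 0.024 \\ 0.446 & 0.116 & 0.24 \\ 0.151 & 0.527 & 0.736 \end{bmatrix}$; $v_0 = \begin{bmatrix} 0.105 \\ 0.019 \\ 0.876 \end{bmatrix}$ 

(e) $M = \begin{bmatrix} 0.151 & 0.355 & 0.163 \\ 0.21 & 0.637 & 0.56 \\ 0.639 & 0.008 & 0.277 \end{bmatrix}$; $v_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ 

(f) $M = \begin{bmatrix} 0.385 & 0.3 & 0.429 \\ 0.049 & 0.651 & 0.378 \\ 0.566 & 0.049 & 0.193 \end{bmatrix}$; $v_0 = \begin{bmatrix} 0.578 \\ 0.269 \\ 0.153 \end{bmatrix}$ [A]-362 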


3. For any square matrix $M$, explain why $M$ and $M^T$ have the same eigenvalues. [A]-362 


4. Every transition matrix is guaranteed to have 1 as an eigenvalue ($M^T$ has 1 as an eigenvalue since 
$M^T \mathbf{1} = \mathbf{1}$). Find the eigenspace of 1 for the transition matrix in question 2. [A]-362 


5. Calculate $M^{32}$ for the transition matrix of question 2 and compare it to your answer in question 4. [A]-363 


6. Find the steady-state distribution for the transition matrix in question 2. [A]-363 


7. For the simplified Snakes and Ladders board,⁴ a 6-sided die with only the numbers 1, 2, and 3 on it is used 
(two of each, making rolling a 1, 2, or 3 equally likely). The rules for Snakes and Ladders can be found at 
GameRules.com. [A]-363 


(a) Create a 5 x 5 transition matrix. Each block on the 
board represents one state (location of the playing 
piece). It is impossible to end a turn on spaces 
3,4,5, or 8 since the playing piece immediately 
slides up or down from there, so only the other 
5 states need to be included. Transitioning from 
state 9 is represented by a column with 4 zeros and 
a one, indicating that once square 9 is reached the 
playing piece never leaves! Mathematically, state 
9 is called an absorbing state (a state that once 
entered cannot be left). 


(b) What is the likelihood of reaching the goal (square 9) in one roll? two rolls? three rolls? ten rolls? 


(c) Assuming the columns (and rows) of the transition matrix represent states 1, 2, 6, 7, 9 in that order, the 
eigenvector corresponding to the eigenvalue 1 is $\begin{bmatrix} 0 & 0 & 0 & 0 & 1 \end{bmatrix}^T$. What does this mean in 
the context of the game? 


8. Create your own Snakes and Ladders board and repeat the analysis of question 7. 


9. The Leontief model for a closed economy is similar to a Markov chain where each column of the transition matrix 
represents a sector’s consumption of each commodity as a proportion of the economy. Consequently the matrix 
is commonly called a consumption matrix rather than a transition matrix. Suppose an economy has three 
industries: farming, building, and clothing. For every dollar of food produced, the farmers use $0.375, the builders 
use $0.35, and the tailors use $0.275. For every dollar of building, the builders use $0.286, the farmers use $0.214, 
and the tailors use $0.5. For every dollar of clothing produced, the tailors use $0.348, the builders use $0.326, 
and the farmers use $0.326. [A]-363 


⁴ Snake, ladder, start, and goal clipart from PublicDomainVectors.org. 


(a) Since all of the production of these three industries 
is consumed by the three industries themselves, the 
consumption matrix has columns that sum to 1. 
Write down the consumption matrix for this economy. 

(b) A production vector v represents the production of 
each industry in dollars. What does the vector Mv 
represent? 


(c) Let $v = \begin{bmatrix} 10 & 57 & 33 \end{bmatrix}^T$ and calculate $Mv$. At 
these production levels (which are in thousands of 
dollars), is there a sector that consumes more than 
it produces? How about a sector that produces 
more than it consumes? 


(d) How do you know there is an “everybody is happy” 
production vector v, one such that Mv = v? 


(e) Calculate the “everybody is happy” production 
vector. In terms of the economy, why might it be 
referred to as such? 


(f) Any solution of Mv = v is, in the context of the 
problem, called an equilibrium. Explain in terms 
of the economy what is in equilibrium at production 
levels in such a v. 


10. Quito, the capital of Ecuador, is located essentially on the equator, and therefore does not experience seasons. The 
weather is more or less the same year round: average low temperature about 50°F and average high temperature 
about 68°F all 12 months of the year. Clouds and rain are similarly predictable. Suppose you are visiting 
Quito and a friend of yours claims it rains there one quarter of the time. For two months you record the weather. 
Your data suggest that on a rainy day there is a 4 chance it will be rainy again the next day and on a dry day there 
is a 3 chance it will be dry again the next day. Are your observations consistent with your friend’s claim? Create 
a Markov chain model of the weather and use it to answer the question. 


11. In a very simple gambling game, you win one dollar with probability 1/2 and lose one dollar with probability 1/2. You 
start with one dollar, and the game ends when you either go broke or win 3 dollars (have 4 dollars). 


(a) Model the playing of this game with a Markov chain. The states are the numbers of dollars you 
can have and the transition matrix is formed from the probabilities of going from one amount of 
money to another. What is the transition matrix, M? 


(b) What is the probability the game will last more than 10 rounds? That is, with what probability will 
you have neither $0 nor $4 after 10 rounds? 


(c) After 10 rounds, what is the probability you have gone broke? 


(d) After 10 rounds, what is the probability you have won $3? 


(e) Should you play this game? 


(f) Find basis vectors for the eigenspace of eigenvalue 1. 


(g) Compare the basis vectors to the columns of M¹⁰. What do you notice? 


(h) Let p be the probability of winning one dollar in a round. What value of p gives you a 50% chance of 
winning $3 within 10 rounds? 


(i) How does p change if the game ends only when you either go broke or win $4 instead of $3? Is the 
value of p that gives you a 50% chance of winning $4 within 10 rounds greater or less than before? 

12. Redo the bikeshare analysis for 2020 in the same section 
of Alexandria as done in the text. Data can be retrieved 
at the ancillary website. How different is the expected 
bike distribution in 2020? Can you think of a reason the 
data are likely to be very different from the 2018 data? 


13. Redo the bikeshare analysis for 2018 in a different neighborhood. Data can be retrieved at the ancillary website. 


7.3 Fourier Series [4.6, 6.4, calculus] 


Figure 7.3.1: Mathematical Model of a 440 Hz Simple Harmonic Sound Wave 


Sound is the perception of pressure variation. Tuning forks, speakers, musical instruments, voice boxes, whistles, 
and anything else that makes sound must somehow cause varying pressure. One common way to create pressure 
variation is through physical vibration. Tuning forks, speakers, the strings of stringed instruments, and vocal cords all 
use this technique. Their vibrations cause alternating moments of compression (increased pressure) and rarefaction 
(reduced pressure) in the air. 


The greater the difference between high and low pressures, the louder the sound. Sometimes the pressure 
difference, called volume or intensity, is so great that our whole bodies vibrate. Thunder, subwoofers, fireworks, and 
helicopters can do this, but for most sounds it is only our eardrums that perceive the pressure variation. 


The faster the pressure alternates between high and low, the higher the pitch. Middle C, for example, is the result 
of pressure varying from neutral to high to low and back to neutral approximately 261.6 times per second. Each 
variation through neutral, high, low, and back to neutral is one cycle, so we say that middle C has a frequency 
of 261.6 cycles per second. One hertz, abbreviated Hz, equals one cycle per second, so we also say middle C has a 
frequency of 261.6 Hz. The highest note on a piano, B8, has a frequency of about 7902.1 Hz and the lowest note on a 
piano has a frequency of about 16.35 Hz. 


The human ear is capable of perceiving frequencies between about 20 Hz and 20,000 Hz. Air pressure can 
certainly alternate between high and low pressures slower than 20 cycles per second and faster than 20,000 cycles per 
second. Our ears just won’t pick up those vibrations. Dog whistles emit sound between 23,000 Hz and 54,000 Hz[5], 
all above the range of human hearing but within the range of canine hearing. Elephants were the first land animals to 
be observed to produce sound below the range of human hearing[19], creating calls with frequencies as low as 14 Hz. 


A sound wave can be modeled by a record of the pressure it causes on a receiver such as a microphone or eardrum 
over time. The simplest type of sound waves are simple harmonic vibrations—one intensity, one frequency, shifting 
from high to low pressure smoothly as a sine curve. Until the advent of electronics, the closest approximation to a 
simple harmonic vibration was the sound of a tuning fork. Over the course of a second or so, neither the frequency 
nor the intensity of a vibrating tuning fork changes appreciably, and the vibrations are sinusoidal. The graph of a 440 
Hz sine wave is a mathematical model of this sound. See figure 7.3.1. 


Naturally produced sounds are not so neat. Even a tuning fork does not produce sound in a perfect sine wave. 
Its intensity decreases continuously as it rings, and air particles do not compress and rarefy in a perfectly sinusoidal 
pattern. The matter is even more complex for musical instruments. Even on a stringed instrument where a string 
vibrates at a steady frequency, the body picks up the vibration and imparts its own signature frequencies to the sound. 
Wind and percussion instruments are the same. The richness of their sound comes from a variable intensity patchwork 
of many frequencies. The graph of two cycles of an actual recording of a singing drum is shown in figure 7.3.2. The 
wave is clearly not sinusoidal, featuring 8 peaks and 8 valleys per cycle. In a sense, the best sinusoidal approximation 
of this sound wave is shown below as $f_1(t) = -5498.21 \sin\left(29 \cdot \frac{8820}{121} \pi t\right)$, a frequency of 
$\frac{1}{2} \cdot 29 \cdot \frac{8820}{121} \approx 1056.9$ Hz. 


Figure 7.3.2: Two cycles of the sound of a drum. 


As expected, this sine wave does a poor job of approximating the drum wave. But we can do better. Allowing the sum 
of two sine waves, we can improve the approximation considerably, shown below as 

$$f_2(t) = -5498.21 \sin\left(29 \cdot \tfrac{8820}{121} \pi t\right) - 4891.4 \sin\left(28 \cdot \tfrac{8820}{121} \pi t\right),$$

a combination of frequencies 1056.9 Hz and $\frac{1}{2} \cdot 28 \cdot \frac{8820}{121} \approx 1020.5$ Hz. 

[Figure: the drum wave and the approximation $f_2$, plotted against time (sec)] 


The approximation now peters out as does the drum wave, and the peaks and valleys match better. Allowing a 
combination of four sinusoidal waves, 

$$f_4(t) = -5498.21 \sin\left(29 \cdot \tfrac{8820}{121} \pi t\right) - 4891.4 \sin\left(28 \cdot \tfrac{8820}{121} \pi t\right) + 4394.8 \sin\left(33 \cdot \tfrac{8820}{121} \pi t\right) - 3469.0 \sin\left(31 \cdot \tfrac{8820}{121} \pi t\right),$$

the approximation continues to improve, as seen here. 


The more sinusoidal waves allowed, the better the approximation. The differences largely disappear with the 
allowance of 14 sinusoidal waves: 


In order of greatest to least intensity, the frequencies represented in $f_{14}$ are 1056.9, 1020.5, 1202.7, 1129.8, 984.0, 
1166.3, 291.6, 1640.1, 1093.4, 1457.9, 1567.2, 1275.6, 328.0, and 1239.2 Hz. For the brief moment represented in 
the graph ($\frac{121}{8820} \approx 0.01372$ sec), this means the sound of the drum was dominated by these 14 frequencies. 

Similarly examining a full 2 seconds of the audio reveals that the overall dominant frequencies of this particular 
drum sound, in order from most to least dominant are 290, 1177, and 1027 Hz. The note played was likely D4, whose 
frequency is 293.66 Hz, equivalent to the D string on a violin or viola played open. 
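The fourteen frequencies quoted for the 0.01372-second window can be reproduced from the harmonic indices alone. The sketch below is plain Python and rests on two assumptions: the window has length $L = \frac{121}{8820}$ sec, and the $m$th sine harmonic $\sin\left(m \cdot \frac{\pi}{L} t\right)$ has frequency $\frac{m}{2L}$. The list of indices $m$ is the one reported later in this section; the name `harmonic_frequency` is made up for the sketch.

```python
WINDOW = 121 / 8820   # assumed length L of the sampled interval, in seconds

def harmonic_frequency(m, L=WINDOW):
    """Frequency in Hz of sin(m * pi * t / L): angular frequency m*pi/L divided by 2*pi."""
    return m / (2 * L)

# Harmonic indices in order of decreasing coefficient magnitude.
indices = [29, 28, 33, 31, 27, 32, 8, 45, 30, 40, 43, 35, 9, 34]

frequencies = [round(harmonic_frequency(m), 1) for m in indices]
print(frequencies)
```

The printed list should match the frequencies quoted above, beginning 1056.9, 1020.5, 1202.7. Notice that consecutive harmonics are about 36.4 Hz apart, so a window this short cannot distinguish frequencies any more finely than that.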

But how do we know which frequencies and intensities to use in approximation? In principle, the answer is simple. 
Theorem 18 of section 6.4 provides the guidance. Orthogonal projection of the function gives the right intensities. 
All we need are an appropriate inner product space and a basis for some subspace. Except in extreme cases, pressure 
waves, and therefore sound waves, vary continuously, so it makes sense to consider the vector space of continuous 
functions over some interval. In particular, we will consider subspaces of C ([0, L]), the set of all functions which are 
continuous on the closed interval [0, L]. 


Much like the inner products of exercises 1e and 1f of section 4.6, $\langle f, g \rangle = \int_0^L f(x) g(x)\, dx$ defines an inner product 
on $C([0, L])$. Can you justify it? Answer on page 255. 

Vectors in $C([0, L])$ are continuous functions, so a basis for any subspace will have to be a collection of 
functions that are continuous on $[0, L]$. As suggested by their use earlier, collections of trigonometric functions provide 
convenient bases. For all $m = 0, 1, 2, \ldots$ and all $n = 1, 2, \ldots$ 

$$\left\langle \cos\left(\tfrac{m\pi}{L} t\right), \cos\left(\tfrac{n\pi}{L} t\right) \right\rangle = 0 \tag{7.3.1}$$

and 

$$\left\langle \sin\left(\tfrac{m\pi}{L} t\right), \sin\left(\tfrac{n\pi}{L} t\right) \right\rangle = 0 \tag{7.3.2}$$

whenever $m \ne n$. That is, for any distinct positive integers $m$ and $n$, $\cos\left(\frac{m\pi}{L} t\right)$ and $\cos\left(\frac{n\pi}{L} t\right)$ are orthogonal, as are 
$\sin\left(\frac{m\pi}{L} t\right)$ and $\sin\left(\frac{n\pi}{L} t\right)$. Inner products (7.3.1) and (7.3.2) can be verified with the help of two trigonometric identities. 


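Recall that $\cos(\alpha \pm \beta) = \cos\alpha \cos\beta \mp \sin\alpha \sin\beta$, so 

$$\tfrac{1}{2} \left[ \cos(\alpha - \beta) + \cos(\alpha + \beta) \right] = \cos\alpha \cos\beta \tag{7.3.3}$$

and 

$$\tfrac{1}{2} \left[ \cos(\alpha - \beta) - \cos(\alpha + \beta) \right] = \sin\alpha \sin\beta. \tag{7.3.4}$$

Can you use (7.3.4) to show (7.3.2)? Answer on page 255. And now we have our candidates for a vector space, 
$C([0, L])$, and basis elements, $\sin\left(\frac{m\pi}{L} t\right)$ with $m > 0$ or $\cos\left(\frac{m\pi}{L} t\right)$ with $m \ge 0$. 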
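Before working through the identities, it can be reassuring to see the orthogonality numerically. The following is a rough sketch in plain Python, assuming the inner product is approximated by composite Simpson's rule and taking $L = 2$ as an arbitrary example; the helper names `inner`, `sin_m`, and `cos_m` are made up for the sketch.

```python
import math

def inner(f, g, L, n=2000):
    """Approximate <f, g> = integral from 0 to L of f(t) g(t) dt by composite Simpson's rule (n even)."""
    h = L / n
    total = f(0) * g(0) + f(L) * g(L)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * g(t)
    return total * h / 3

def sin_m(m, L):
    return lambda t: math.sin(m * math.pi * t / L)

def cos_m(m, L):
    return lambda t: math.cos(m * math.pi * t / L)

L = 2.0   # arbitrary interval length, chosen only for illustration
pairs = [(m, n) for m in range(1, 6) for n in range(1, 6) if m != n]

# Largest inner product (in magnitude) among distinct harmonics; should be essentially zero.
worst_sin = max(abs(inner(sin_m(m, L), sin_m(n, L), L)) for m, n in pairs)
worst_cos = max(abs(inner(cos_m(m, L), cos_m(n, L), L)) for m, n in pairs)

# A harmonic paired with itself is not zero; compare with exercise 4 of this section.
same_sin = inner(sin_m(3, L), sin_m(3, L), L)
print(worst_sin, worst_cos, same_sin)
```

The first two numbers come out at round-off level, while the last is $L/2 = 1$. A numerical check like this is no proof, but it catches a misremembered formula quickly.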

When the function of interest takes the value zero at both endpoints of the interval, as is the case for the sound 
waves we have looked at, it makes best sense to use sine functions for a basis. Every sine function of the form $\sin\left(\frac{m\pi}{L} t\right)$ 
takes the value zero at both $t = 0$ and $t = L$, so the approximation is exact at the endpoints no matter how many basis 
elements are used. If the function were nonzero at either endpoint, it would make more sense to use cosine functions 
for a basis, as this allows approximation of the nonzero endpoint(s). 


Crumpet 33: A theorem of Fejér 


[Fejér, 1900] Let $f$ be a continuous function on $[-\pi, \pi]$ for which $f(-\pi) = f(\pi)$. Then the sequence of Cesàro 
means of the partial sums of the Fourier series for $f$ converges uniformly to $f$ on $[-\pi, \pi]$.[22] This theorem applies 
to any continuous function on $[0, \pi]$ by extending it as an even function over $[-\pi, \pi]$ (in which case the sine terms 
all have zero Fourier coefficients, yielding a cosine series) or, when $f(0) = f(\pi) = 0$, as an odd function over 
$[-\pi, \pi]$ (in which case the cosine terms all have zero Fourier coefficients, yielding a sine series). If $f$ additionally 
has a piecewise continuous first derivative, then the sequence of partial sums of the Fourier series for $f$ converges 
uniformly to $f$ on $[-\pi, \pi]$. For many physical applications, such as the sound waves discussed here, the functions we 
are trying to approximate have continuous first derivatives and therefore their extensions have piecewise continuous 
first derivatives, and the theorem applies. By proper scaling, these results can be modified to apply to domains such 
as $[-L, L]$ or $[0, L]$. 


The upshot of results like this is that there exist bases of trigonometric functions for subspaces containing vectors 
(linear combinations of trigonometric functions) arbitrarily close to a given vector (function). In other words, certain 
functions can be approximated with arbitrary precision using sums of sines and cosines. 


Given a continuous function, we choose a set of sine functions or a set of cosine functions as basis for a subspace 
and we project onto the subspace to find the best approximation. For example, consider approximating $f(x) = 
x(x - 1)(x - 2)$ over the interval $[0, 2]$. How closely can we approximate $f$ by vectors in the spans of 


1. $B_1 = \left\{ 1, \cos\left(\frac{\pi}{2} x\right), \cos(\pi x) \right\}$ (from the family of cosine functions with $m = 0, 1, 2$)? 
Answer: The best approximation is 

$$v = \operatorname{proj}_{\operatorname{span}(B_1)} f = \frac{\langle f, 1 \rangle}{\langle 1, 1 \rangle} + \frac{\left\langle f, \cos\left(\frac{\pi}{2} x\right) \right\rangle}{\left\langle \cos\left(\frac{\pi}{2} x\right), \cos\left(\frac{\pi}{2} x\right) \right\rangle} \cos\left(\tfrac{\pi}{2} x\right) + \frac{\langle f, \cos(\pi x) \rangle}{\langle \cos(\pi x), \cos(\pi x) \rangle} \cos(\pi x).$$

Using a computer algebra system to help with the integration, 

$$v = \frac{16(12 - \pi^2)}{\pi^4} \cos\left(\tfrac{\pi}{2} x\right) \approx 0.3499 \cos\left(\tfrac{\pi}{2} x\right).$$

Due to symmetry, the first and third terms are zero. The distance between $v$ and $f$ (see section 4.6) is 

$$d(f, v) = \| f - v \| = \sqrt{\langle f - v, f - v \rangle} \approx \sqrt{0.0299} \approx 0.173.$$

The graphs of $f$ and $v$ are shown below with the area between the two shaded over the interval $[0, 2]$. 
Since the distance between $f$ and $v$ involves integrating $(f - v)^2$, the shaded area does not represent the 
distance between the vectors exactly, but it helps give a visual sense of this distance. 


[Figure: graphs of $f$ and $v$ over $[0, 2]$ with the area between them shaded] 


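2. $B_2 = \left\{ \sin\left(\frac{\pi}{2} x\right), \sin(\pi x), \sin\left(\frac{3\pi}{2} x\right) \right\}$ (from the family of sine functions with $m = 1, 2, 3$)? 


Answer: The best approximation is 

$$w = \operatorname{proj}_{\operatorname{span}(B_2)} f = \frac{\left\langle f, \sin\left(\frac{\pi}{2} x\right) \right\rangle}{\left\langle \sin\left(\frac{\pi}{2} x\right), \sin\left(\frac{\pi}{2} x\right) \right\rangle} \sin\left(\tfrac{\pi}{2} x\right) + \frac{\langle f, \sin(\pi x) \rangle}{\langle \sin(\pi x), \sin(\pi x) \rangle} \sin(\pi x) + \frac{\left\langle f, \sin\left(\frac{3\pi}{2} x\right) \right\rangle}{\left\langle \sin\left(\frac{3\pi}{2} x\right), \sin\left(\frac{3\pi}{2} x\right) \right\rangle} \sin\left(\tfrac{3\pi}{2} x\right).$$

Using a computer algebra system to help with the integration, 

$$w = \frac{12}{\pi^3} \sin(\pi x).$$

Due to symmetry the first and third terms are zero again. The distance between $w$ and $f$ is 

$$d(f, w) = \sqrt{\langle f - w, f - w \rangle} \approx \sqrt{0.0026} \approx 0.051.$$

The graphs of $f$ and $w$ are shown below with the area between the two shaded. As the distance calculation 
suggests, this approximation is closer than the last. 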


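[Figure: graphs of $f$ and $w$ over $[0, 2]$ with the area between them shaded] 


3. $B_3 = \left\{ \sin\left(\frac{m\pi}{2} x\right) : m = 1, 2, \ldots, 10 \right\}$? 
Answer: The best approximation is 

$$\mathbf{x} = \operatorname{proj}_{\operatorname{span}(B_3)} f = \sum_{m=1}^{10} \frac{\left\langle f, \sin\left(\frac{m\pi}{2} x\right) \right\rangle}{\left\langle \sin\left(\frac{m\pi}{2} x\right), \sin\left(\frac{m\pi}{2} x\right) \right\rangle} \sin\left(\tfrac{m\pi}{2} x\right).$$

Using a computer algebra system to help with the integration, 

$$\mathbf{x} = \frac{12}{125\pi^3} \sin(5\pi x) + \frac{3}{16\pi^3} \sin(4\pi x) + \frac{4}{9\pi^3} \sin(3\pi x) + \frac{3}{2\pi^3} \sin(2\pi x) + \frac{12}{\pi^3} \sin(\pi x).$$

Due to symmetry the odd terms are zero. The distance between $\mathbf{x}$ and $f$ is 

$$d(f, \mathbf{x}) \approx \sqrt{0.00000572} \approx 0.0024.$$

The graphs of $f$ and $\mathbf{x}$ are shown below. The area between the vectors over the interval $[0, 2]$ is essentially 
imperceptible. 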


The choice of subspace, and therefore its basis, makes all the difference in how closely the given function can be 
approximated. 
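For readers without a computer algebra system, the projection onto the span of the first three sine functions can be approximated numerically. This is a rough sketch in plain Python, assuming composite Simpson's rule stands in for exact integration; the names `inner`, `basis`, `coeffs`, and `residual` are made up for the sketch.

```python
import math

def inner(f, g, a, b, n=2000):
    """Approximate <f, g> = integral over [a, b] of f(x) g(x) dx by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) * g(a) + f(b) * g(b)
    for k in range(1, n):
        x = a + k * h
        total += (4 if k % 2 else 2) * f(x) * g(x)
    return total * h / 3

def f(x):
    return x * (x - 1) * (x - 2)

L = 2.0
# Basis functions sin(m * pi * x / L) for m = 1, 2, 3.
basis = [lambda x, m=m: math.sin(m * math.pi * x / L) for m in (1, 2, 3)]

# Projection coefficients <f, s> / <s, s>, one per basis function.
coeffs = [inner(f, s, 0, L) / inner(s, s, 0, L) for s in basis]

def w(x):
    return sum(c * s(x) for c, s in zip(coeffs, basis))

def residual(x):
    return f(x) - w(x)

squared_distance = inner(residual, residual, 0, L)   # <f - w, f - w>
print(coeffs)             # first and third are essentially zero
print(squared_distance)   # about 0.0026
```

The middle coefficient agrees with $\frac{12}{\pi^3} \approx 0.387$ and the other two vanish by symmetry. The quantity $\langle f - w, f - w \rangle$ comes out near 0.0026, and the distance $d(f, w)$ is its square root. Adding more harmonics to `basis` shows the improvement described above.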
In the case of the sound wave that opened this section, coefficients 

$$\frac{\left\langle \text{Drum}, \sin\left(m \cdot \frac{8820}{121} \pi t\right) \right\rangle}{\left\langle \sin\left(m \cdot \frac{8820}{121} \pi t\right), \sin\left(m \cdot \frac{8820}{121} \pi t\right) \right\rangle}, \qquad m = 1, 2, \ldots, 120$$

were calculated and sorted from greatest magnitude to least. In decreasing order, the top 14 magnitudes were from 
the coefficients corresponding to $m = 29, 28, 33, 31, 27, 32, 8, 45, 30, 40, 43, 35, 9, 34$, so the set 

$$\left\{ \sin\left(m \cdot \tfrac{8820}{121} \pi t\right) : m = 29, 28, 33, 31, 27, 32, 8, 45, 30, 40, 43, 35, 9, 34 \right\}$$

and subsets were chosen as bases to create the approximations. The frequency of any harmonic function, $\sin(\omega x)$ or 
$\cos(\omega x)$, is $\frac{\omega}{2\pi}$, so the frequencies of these basis elements are 

$$\left\{ \tfrac{m}{2} \cdot \tfrac{8820}{121} : m = 29, 28, 33, 31, 27, 32, 8, 45, 30, 40, 43, 35, 9, 34 \right\}$$

$$= \{1056.9, 1020.5, 1202.7, 1129.8, 984.0, 1166.3, 291.6, 1640.1, 1093.4, 1457.9, 1567.2, 1275.6, 328.0, 1239.2\}$$

as noted previously. 


Figure 7.3.3: Recording of “linear algebra rules” as displayed by Audacity 


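For a function $f$ defined over the interval $[0, L]$, the (infinite) series 

$$b_1 \sin\left(\tfrac{\pi}{L} x\right) + b_2 \sin\left(2 \cdot \tfrac{\pi}{L} x\right) + b_3 \sin\left(3 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$b_m = \frac{\left\langle f, \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \sin\left(m \cdot \frac{\pi}{L} t\right), \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}$$

is called the Fourier sine series for $f$. The (infinite) series 

$$a_0 + a_1 \cos\left(\tfrac{\pi}{L} x\right) + a_2 \cos\left(2 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$a_m = \frac{\left\langle f, \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \cos\left(m \cdot \frac{\pi}{L} t\right), \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}$$

is called the Fourier cosine series for $f$. The $a_m$ and $b_m$ are called Fourier coefficients, and the functions $\sin\left(m \cdot \frac{\pi}{L} x\right)$ 
and $\cos\left(m \cdot \frac{\pi}{L} x\right)$ are called the $m$th harmonics. Most of our effort to now has concentrated on the sine series since we 
have been examining functions for which $f(0) = f(L) = 0$. In general though, any piecewise continuous function can 
be approximated arbitrarily closely using a finite number of terms of either the Fourier sine series, as we have done, 
or the Fourier cosine series. The fit will simply be better with certain selections of harmonics. 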

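A general Fourier series involves both sine and cosine functions and is defined over domains symmetric about 
zero. For any function $f$ defined over the interval $[-L, L]$, the (infinite) series 

$$a_0 + a_1 \cos\left(\tfrac{\pi}{L} x\right) + b_1 \sin\left(\tfrac{\pi}{L} x\right) + a_2 \cos\left(2 \cdot \tfrac{\pi}{L} x\right) + b_2 \sin\left(2 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$a_m = \frac{\left\langle f, \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \cos\left(m \cdot \frac{\pi}{L} t\right), \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle} \quad \text{and} \quad b_m = \frac{\left\langle f, \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \sin\left(m \cdot \frac{\pi}{L} t\right), \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}$$

is called the Fourier series for $f$. Indeed, the term Fourier series, with no qualifiers, refers to this series, not the sine 
or cosine series. As before the $a_m$ and $b_m$ are called Fourier coefficients and the functions $\sin\left(m \cdot \frac{\pi}{L} x\right)$ and $\cos\left(m \cdot \frac{\pi}{L} x\right)$ 
are called the $m$th harmonics. 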

Whether any of these series converges to something useful is a deep and interesting topic of analysis. Generally, 
though, piecewise continuous functions can be approximated arbitrarily closely using a finite number of terms from 
any one of the Fourier series. Because sines and cosines are periodic, especially good approximations with small 
numbers of harmonics can often be found for functions $f$ where $f(-L) = f(L)$ (for general Fourier series) or 
$f(0) = f(L)$ (for sine series). 

It is not only functions that show some regularity or symmetry that can be approximated by Fourier series, 
however. The real power of Fourier analysis is to suss out the most important frequencies when none are apparent. 
Circling back to sound waves, figure 7.3.3⁵ shows the full 1.82 seconds of a voice saying “linear algebra rules”. The 
sound wave shows no particular symmetry or regularity since the sound is constantly changing throughout. 


⁵ Audacity: https://www.audacityteam.org/ 


The first 16,095 Fourier sine series coefficients were calculated. The following graphs of the “linear algebra 
rules” audio and its approximations over two separate time intervals illustrate how extending the basis improves 
the approximation. F1, F74, F637, and F16095 are the approximations using the dominant 1, 74, 637, and 16095 
harmonics, respectively. 


[Figure: two panels plotting pressure against time (sec) for the approximations F1, F74, F637, and F16095 together with the original wave] 


F16095 is barely visible beneath the original curve, suggesting that the approximation is very good (which should 
probably be expected having used so many terms). It may be surprising then that the distance between F16095 and the 
original wave is about 254.4, a number that may seem large. Distances are relative to the function being approximated, 
though. The norm of the original sound wave is 3493.5, so d(F16095, original) is only about one fourteenth (7.3%) 
the norm (size) of the original, which is not bad. On the other hand, the distances between the original and F1, F74, and F637 
are 3479.5, 3032.0, and 2084.8, respectively. As their distances are similar in magnitude to the norm of the sound 
wave itself, it should be expected that they do a poor job approximating the original sound wave. This expectation is 
borne out by the graphs. 
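To compare the four approximations on an equal footing, each distance can be divided by the norm of the original wave. The short sketch below, in plain Python, uses only the distances and the norm quoted above; the dictionary names are made up for the sketch.

```python
norm_original = 3493.5   # norm of the original sound wave, as quoted above

# Distances between the original wave and each approximation, as quoted above.
distances = {"F1": 3479.5, "F74": 3032.0, "F637": 2084.8, "F16095": 254.4}

# Relative distance: distance divided by the norm of the function being approximated.
relative = {name: d / norm_original for name, d in distances.items()}

for name, r in relative.items():
    print(f"{name}: {100 * r:.1f}% of the norm of the original")
```

Run as written, this reports roughly 99.6%, 86.8%, 59.7%, and 7.3%. Only the last approximation is small relative to the wave it is meant to reproduce.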

In the end, though, the sound of the reproduction should be the judge of the quality of the approximation. The 
ancillary website contains all the data and playable sound files for the sounds mentioned in this section as well as 
several others, such as boiling water and birds chirping. The audio corresponding to F1 is a simple computer tone and 
does not resemble the original audio at all except that it captures the overall pitch. The sound of F74 is slightly better 
in that it oscillates, but in no way sounds like speech. The words can clearly be heard behind a noisy foreground in 
F637, but it is still a poor reproduction. Think Alexander Graham Bell and his first phone call. Finally, F16095 and 
the original audio are indistinguishable, at least to my ear. Have a listen! 


Key Concepts 
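Fourier series for functions $f$ defined over $[-L, L]$, the series 

$$a_0 + a_1 \cos\left(\tfrac{\pi}{L} x\right) + b_1 \sin\left(\tfrac{\pi}{L} x\right) + a_2 \cos\left(2 \cdot \tfrac{\pi}{L} x\right) + b_2 \sin\left(2 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$a_m = \frac{\left\langle f, \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \cos\left(m \cdot \frac{\pi}{L} t\right), \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle} \quad \text{and} \quad b_m = \frac{\left\langle f, \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \sin\left(m \cdot \frac{\pi}{L} t\right), \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}.$$


Fourier sine series for functions $f$ defined over $[0, L]$, the series 

$$b_1 \sin\left(\tfrac{\pi}{L} x\right) + b_2 \sin\left(2 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$b_m = \frac{\left\langle f, \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \sin\left(m \cdot \frac{\pi}{L} t\right), \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}.$$


Fourier cosine series for functions $f$ defined over $[0, L]$, the series 

$$a_0 + a_1 \cos\left(\tfrac{\pi}{L} x\right) + a_2 \cos\left(2 \cdot \tfrac{\pi}{L} x\right) + \cdots$$

where 

$$a_m = \frac{\left\langle f, \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}{\left\langle \cos\left(m \cdot \frac{\pi}{L} t\right), \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle}.$$


Fourier coefficients the $a_m$ and $b_m$ of a Fourier series, Fourier sine series, or Fourier cosine series. 


harmonics the functions $\cos\left(m \cdot \frac{\pi}{L} t\right)$ and $\sin\left(m \cdot \frac{\pi}{L} t\right)$ appearing in Fourier series are called $m$th harmonics, or simply 
harmonics when not referring to any particular frequency. 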


approximations piecewise continuous functions can be approximated arbitrarily closely using a finite number of 
terms from any one of the Fourier series. Especially good approximations with small numbers of harmonics 
can often be found for smooth functions f where f(—L) = f(L) (for general Fourier series) or f(0) = f(L) (for 
sine or cosine series). 


Fourier analysis the process of determining a large selection of Fourier coefficients with the purpose of identifying 
those with some particular characteristic. 


Exercises 


1. Argue that $C([0, L])$ is a vector space. [A]-363 
2. Justify (7.3.3). 


3. Why do we not include the case m = 0 in the family of 
harmonics for Fourier sine series? [A]-363 


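4. Show that, in the inner product space $C([0, L])$, 


(a) $\langle 1, 1 \rangle = L$ 
(b) $\left\langle \cos\left(m \cdot \frac{\pi}{L} t\right), \cos\left(m \cdot \frac{\pi}{L} t\right) \right\rangle = \frac{L}{2}$ for $m = 1, 2, \ldots$ [S]-339 


(c) $\left\langle \sin\left(m \cdot \frac{\pi}{L} t\right), \sin\left(m \cdot \frac{\pi}{L} t\right) \right\rangle = \frac{L}{2}$ for $m = 1, 2, \ldots$ 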


5. Find the Fourier sine series for f over the given inter- 
val. Use symmetry whenever possible to help with the 
calculations. 


(a) f(x) = 1; [0,1] 
(b) f(x) = x; [0,1] [S]-339 


(c) $f(x) = \begin{cases} x & x \le 1/2 \\ 1 - x & x > 1/2 \end{cases}$; [0,1]
(d) $f(x) = e^x$; [0, ln 2] [A]-363


6. Find the Fourier cosine series for f over the given inter- 
val. Use symmetry whenever possible to help with the 
calculations. 


(a) f(x) = $--x; [0,1] 
(b) f(x) = x; [0,1] [S]-339 
(c) $f(x) = \begin{cases} x & x \le 1/2 \\ 1 - x & x > 1/2 \end{cases}$; [0,1]
(d) $f(x) = e^x$; [0, ln 2] [A]-363
7. Find the Fourier series for f over the interval [-1, 1]. 


(a) f(x) = 1 [S]-340 


(b) $f(x) = x$
(c) $f(x) = |x|$
(d) $f(x) = e^x$
(e) fi) = a 
xX 


8. Create graphs of f and the Fourier series of question
5 with the first (i) 1 nonzero term; (ii) 2 nonzero terms;
and (iii) 5 nonzero terms. [A]-363


9. Create graphs of f and the Fourier series of question
6 with the first (i) 1 nonzero term; (ii) 2 nonzero terms;
and (iii) 5 nonzero terms. [A]-364


10. Create graphs of f and the Fourier series of question
7 with the first (i) 1 nonzero term; (ii) 3 nonzero terms;
and (iii) 5 nonzero terms. [A]-365


11. Reproduce a sound wave, part 1. 


(a) Download one of the data spreadsheets of the 
sounds on the ancillary website. 


(b) Sort the Fourier sine series coefficients by decreas- 
ing magnitude. 


(c) Using the twenty harmonics with greatest magni- 
tude coefficients, reproduce the sound wave as a 
sum of these twenty sine functions. 
(d) Graph the original sound wave and its reproduction
on the same set of axes.


12. Reproduce a sound wave, part 2.


(a) Grab about 600 samples (6/441 sec) from one of
the data spreadsheets of the sounds on the ancillary
website.


(b) Compute the Fourier cosine series coefficients for
the first 100 harmonics. You will need to write a
computer program that implements a numerical
integration technique. The trapezoidal rule will
suffice, for example.


(c) Using the twenty harmonics with greatest magnitude
coefficients, reproduce the sound wave as a
sum of cosine functions.


(d) Graph the original sound wave and its reproduction
on the same set of axes.


Answers 


inner product on C([0, L]) The properties of an inner product are justified one by one below.


1. For any function f in C([0, L]), $\langle f, f\rangle = \int_0^L f^2(x)\,dx \ge 0$ since $f^2(x) \ge 0$ for all x in [0, L]. In other
words, $f^2(x)$ is nonnegative, so its definite integral is nonnegative.


2. Of course, if $f(x) = 0$ (that is, $f = 0$), then $\langle f, f\rangle = \int_0^L f^2(x)\,dx = \int_0^L 0\,dx = 0$. Now suppose $f \ne 0$. That
is, there is some $x_0$ in [0, L] for which $f(x_0) \ne 0$. Let $z = f^2(x_0) > 0$. Since $f^2$ is continuous, there is a $\delta$
such that whenever $|x - x_0| < \delta$, x in [0, L], $|f^2(x) - f^2(x_0)| = |f^2(x) - z| < \frac{z}{2}$. This establishes an interval
I of width at least $\delta$ within [0, L] where $f^2(x) > \frac{z}{2}$, so $\langle f, f\rangle = \int_0^L f^2(x)\,dx \ge \int_I \frac{z}{2}\,dx \ge \delta\frac{z}{2} > 0$. Hence
if $f \ne 0$ then $\langle f, f\rangle \ne 0$, or contrapositively if $\langle f, f\rangle = 0$ then $f = 0$.


3. For any f, g in C([0, L]), $\langle f, g\rangle = \int_0^L f(x)g(x)\,dx = \int_0^L g(x)f(x)\,dx = \langle g, f\rangle$ since multiplication is
commutative.


4. For any f, g, h in C([0, L]),

\[ \langle f + g, h\rangle = \int_0^L (f(x) + g(x))h(x)\,dx = \int_0^L (f(x)h(x) + g(x)h(x))\,dx = \int_0^L f(x)h(x)\,dx + \int_0^L g(x)h(x)\,dx = \langle f, h\rangle + \langle g, h\rangle \]

by the distributive property for real numbers and a standard result of calculus.


5. For any f, g in C([0, L]) and any scalar c, $\langle cf, g\rangle = \int_0^L cf(x)g(x)\,dx = c\int_0^L f(x)g(x)\,dx = c\langle f, g\rangle$ by a
standard result of calculus.


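sine functions are orthogonal By (7.3.4),

\[ \sin\left(m\tfrac{\pi}{L}t\right)\sin\left(n\tfrac{\pi}{L}t\right) = \tfrac{1}{2}\left[\cos\left((m - n)\tfrac{\pi}{L}t\right) - \cos\left((m + n)\tfrac{\pi}{L}t\right)\right] \]

so

\[ \left\langle \sin\left(m\tfrac{\pi}{L}t\right), \sin\left(n\tfrac{\pi}{L}t\right)\right\rangle = \int_0^L \sin\left(m\tfrac{\pi}{L}t\right)\sin\left(n\tfrac{\pi}{L}t\right)dt = \tfrac{1}{2}\int_0^L \left[\cos\left((m - n)\tfrac{\pi}{L}t\right) - \cos\left((m + n)\tfrac{\pi}{L}t\right)\right]dt \]

\[ = \tfrac{1}{2}\left[\frac{L}{\pi(m - n)}\sin\left((m - n)\tfrac{\pi}{L}t\right) - \frac{L}{\pi(m + n)}\sin\left((m + n)\tfrac{\pi}{L}t\right)\right]_0^L = 0. \]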
7.4 Discrete Dynamical Systems [3.2, 3.3]


You have just finished chopping, slicing, mixing, blending, marinating, layering, and otherwise preparing your fa- 
vorite dish. You are ready to place it in the oven when you realize it has not been preheated. Preheat now, or put your 
assembled dish in, start the oven and guess how long it will take to properly bake? Neither! Model the situation with 
a discrete dynamical system and know just how long to put it in a cold oven. 

After doing the experiment once, you will know just what to do next time you forget to preheat. For the ex- 
periment, place a thermometer probe in the empty oven approximately where you will later put the brownies. Note 
the temperature of the probe (air inside the oven) and begin preheating the oven. Record the thermometer reading 
every 30 seconds until the oven reaches the target temperature (likely 350°F/175°C for brownies but you may want
to continue to a higher temperature if you are baking something else).

Make a graph of temperature versus time. If your oven is like mine, you will get a curve that looks something like 
this.° 


[Figure: Empty Oven, Preheating. Temperature (F) versus Time (min).]


Curiously there is essentially no heating during the first 90 seconds, after which the temperature increases in a very 
steadily linear fashion at about 21°F per minute. We will use this observation to model the temperature of the oven 
during preheating. 

Remove the probe from the oven, let it cool, and let the oven return to its preheated temperature. When the probe 
has cooled to room temperature and the brownie mix has been poured into the brownie pan, insert the probe into the 
brownie mix. Record its internal temperature and put the brownies in the preheated oven. Record the thermometer 
reading every minute until the brownies are done. You will notice the heating is not linear. 

Newton’s law of cooling, which applies equally to heating, suggests that the change in temperature of a body is 
approximately proportional to the difference between the temperature of the body and the temperature of its surround- 
ings, ambient temperature. As an equation, 


\[ \Delta T = k(M - T). \]


If brownies obey this law, plotting the temperature over time will reveal a concave down graph. As the brownies’ temperature
increases, the difference between ambient (oven) temperature and brownie temperature, $(M - T)$, decreases.
In turn, the change in temperature over a fixed amount of time, $\Delta T$, will also decrease. This is, at least as a general
characteristic, exactly what the data provide!


Data, graphs, and calculations for this entire discussion are available in a spreadsheet at the ancillary website. 
[Figure: Brownies in a Preheated Oven.]

However, if the brownies truly follow Newton’s law of cooling, a plot of $M - T$ versus $\Delta T$ will reveal a straight
line passing through the origin, just as any two directly related variables will. Alas this is not what the data suggest.

[Figure: Brownies in a Preheated Oven. $\Delta T$ (F) versus $M - T$ (F).]


The scatterplot looks quite linear, but clearly would not pass through the origin if extended to $M - T = 0$. Unfortunately,
this is a critical feature of the law. When there is no difference between the temperature of a body and its
surroundings, the body will neither heat nor cool. A glass of water left on the counter for hours, days, or weeks
will remain at room temperature for the duration.

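Nonetheless, this is what the data are telling us, law or no law. We apply linear regression (section 7.1) to the
data, deriving a model of the form $\Delta T = a_0 + a_1(M - T)$ that applies when $146^\circ\mathrm{F} \le M - T \le 284^\circ\mathrm{F}$. The normal
equations are

\[ \begin{bmatrix} 26 & 4936 \\ 4936 & 977404 \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \end{bmatrix} = \begin{bmatrix} 139 \\ 30395 \end{bmatrix} \]

and have solution

\[ \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} = \begin{bmatrix} 26 & 4936 \\ 4936 & 977404 \end{bmatrix}^{-1}\begin{bmatrix} 139 \\ 30395 \end{bmatrix} \approx \begin{bmatrix} -13.516 \\ .099356 \end{bmatrix}. \]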
Hence the temperature of the brownies can reasonably be modeled by $\Delta T = -13.516 + .099356(M - T)$. The graph
here illustrates the reasonableness of this model.

[Figure: Brownies in a Preheated Oven. Actual and model $\Delta T$ (F) versus $M - T$ (F).]
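The regression coefficients can be checked directly from the normal equations quoted in the text. The following is a minimal sketch in plain Python; the helper names `solve_2x2` and `delta_T` are illustrative choices, not part of the book or its spreadsheet, and Cramer's rule is used only because the system is 2 by 2.

```python
def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] [x, y]^T = [r1, r2]^T by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("the system is singular")
    return (r1 * d - b * r2) / det, (a * r2 - c * r1) / det

# Entries of the normal equations as given in the text.
a0, a1 = solve_2x2(26, 4936, 4936, 977404, 139, 30395)
print(f"a0 = {a0:.3f}, a1 = {a1:.6f}")

def delta_T(diff):
    """Modeled one-minute change in brownie temperature for a difference M - T."""
    return a0 + a1 * diff

# At the smallest observed difference the modeled change is close to 1 degree.
print(f"delta_T(146) = {delta_T(146):.3f}")
```

The printed coefficients should agree with the rounded values $-13.516$ and $.099356$ used in the model.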


This model looks very good for temperature differences, $M - T$, above 146°F, but putting brownies in a cold
oven requires a model for much smaller temperature differences. After all, the brownie mix and the unheated oven
are essentially the same temperature to begin, $M - T \approx 0$. Luckily Newton’s law of cooling applies for “small”
temperature differences. We can conclude that for our brownies the 350°F oven provides a temperature difference
outside the range of “small”. Without any further data, we will assume that Newton’s law of cooling applies at
temperature differences less than 146°F (the smallest observed temperature difference). According to our model, at
$M - T = 146$, we have $\Delta T = -13.516 + .099356(146) = 0.98998 \approx 1$. Since Newton’s law implies that $\Delta T$ and
$M - T$ are directly proportional, we arrive at the simple relation $\Delta T = \frac{1}{146}(M - T)$ for $0 \le M - T \le 146$.

Finally we are ready to run a simulation and find out just how long the brownies should bake starting in a cold
oven. From the observation of oven temperature during preheating, we have

\[ \Delta M = 21 \]

90 seconds or more into heating (and $\Delta M = 0$ prior since the oven does not heat during the first 90 seconds).
Substituting $\Delta T = T(t + 1) - T(t)$ and $\Delta M = M(t + 1) - M(t)$, the changes in temperature over the course of 1 minute,
we have starting at 1.5 minutes,

\[ \begin{bmatrix} M(t+1) \\ T(t+1) \end{bmatrix} = \begin{bmatrix} M(t) \\ T(t) \end{bmatrix} + \begin{bmatrix} 21 \\ \frac{1}{146}(M(t) - T(t)) \end{bmatrix}. \tag{7.4.1} \]


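Given that the brownies began at 66°F and the oven began at 72°F, we have

\[ \begin{bmatrix} M(1.5) \\ T(1.5) \end{bmatrix} = \begin{bmatrix} 72 \\ 66 \end{bmatrix}, \]

\[ \begin{bmatrix} M(2.5) \\ T(2.5) \end{bmatrix} = \begin{bmatrix} M(1.5) \\ T(1.5) \end{bmatrix} + \begin{bmatrix} 21 \\ \frac{1}{146}(M(1.5) - T(1.5)) \end{bmatrix} = \begin{bmatrix} 72 \\ 66 \end{bmatrix} + \begin{bmatrix} 21 \\ \frac{1}{146}(72 - 66) \end{bmatrix} = \begin{bmatrix} 93 \\ 66.041 \end{bmatrix}, \]

\[ \begin{bmatrix} M(3.5) \\ T(3.5) \end{bmatrix} = \begin{bmatrix} M(2.5) \\ T(2.5) \end{bmatrix} + \begin{bmatrix} 21 \\ \frac{1}{146}(M(2.5) - T(2.5)) \end{bmatrix} = \begin{bmatrix} 114 \\ 66.226 \end{bmatrix}, \ldots \]

and more succinctly,

\[ \begin{bmatrix} M(1.5) \\ T(1.5) \end{bmatrix}, \begin{bmatrix} M(2.5) \\ T(2.5) \end{bmatrix}, \begin{bmatrix} M(3.5) \\ T(3.5) \end{bmatrix}, \ldots = \begin{bmatrix} 72 \\ 66 \end{bmatrix}, \begin{bmatrix} 93 \\ 66.041 \end{bmatrix}, \begin{bmatrix} 114 \\ 66.226 \end{bmatrix}, \begin{bmatrix} 135 \\ 66.553 \end{bmatrix}, \begin{bmatrix} 156 \\ 67.022 \end{bmatrix}, \begin{bmatrix} 177 \\ 67.631 \end{bmatrix}, \begin{bmatrix} 198 \\ 68.380 \end{bmatrix}, \begin{bmatrix} 219 \\ 69.268 \end{bmatrix}, \ldots \]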
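bringing us to 8.5 minutes. At this point, $M - T = 219 - 69.268 = 149.73$, which exceeds 146. To continue, we need
to start using $\Delta T = -13.516 + .099356(M - T)$, or $T(t + 1) = T(t) - 13.516 + .099356(M(t) - T(t))$ for the change in
brownie temperature. In other words, we now have

\[ \begin{bmatrix} M(t+1) \\ T(t+1) \end{bmatrix} = \begin{bmatrix} M(t) \\ T(t) \end{bmatrix} + \begin{bmatrix} 21 \\ -13.516 + .099356(M(t) - T(t)) \end{bmatrix}. \tag{7.4.2} \]

So

\[ \begin{bmatrix} M(9.5) \\ T(9.5) \end{bmatrix} = \begin{bmatrix} 219 \\ 69.268 \end{bmatrix} + \begin{bmatrix} 21 \\ -13.516 + .099356(149.73) \end{bmatrix} = \begin{bmatrix} 240 \\ 70.629 \end{bmatrix} \]

and so on,

\[ \begin{bmatrix} M(10.5) \\ T(10.5) \end{bmatrix}, \begin{bmatrix} M(11.5) \\ T(11.5) \end{bmatrix}, \begin{bmatrix} M(12.5) \\ T(12.5) \end{bmatrix}, \ldots = \begin{bmatrix} 261 \\ 73.941 \end{bmatrix}, \begin{bmatrix} 282 \\ 79.010 \end{bmatrix}, \begin{bmatrix} 303 \\ 85.662 \end{bmatrix}, \begin{bmatrix} 324 \\ 93.740 \end{bmatrix}, \begin{bmatrix} 345 \\ 103.10 \end{bmatrix}, \ldots \]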


at which point we reach another milestone. The oven temperature does not jump another 21°F at this point. It will 
only increase another 5°F, so is essentially up to working temperature. The first 14.5 minutes of baking brings the 
oven to ~ 350°F and the brownies to ~ 103°F, a state that the brownies baking in a preheated oven reached in about 
2.5 minutes. To summarize, brownies in a cold oven took 14.5 minutes to get to the same point (oven temperature 
350°F, brownie temperature 103°F) the brownies reached in a preheated oven in only 2.5 minutes. From here out 
it is safe to assume the baking will proceed similarly. Therefore it takes 12 more minutes to bake brownies starting 
in a cold oven than it does starting in a preheated oven. We simply add 12 minutes to the baking time and proceed. 
Presumably this applies to any baking done at 350°F. The first 14.5 minutes of baking starting with a cold oven are 
equivalent to only 2.5 minutes of baking starting with a preheated oven. 
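The whole cold-oven simulation can be replayed in a few lines. This is a sketch under the assumptions stated in the text: the oven gains 21°F per minute starting at 1.5 minutes, the brownies follow $\Delta T = \frac{1}{146}(M - T)$ while $M - T \le 146$, and the regression model applies beyond that. The function name `simulate_cold_oven` and the stopping temperature of 345°F are illustrative choices.

```python
def simulate_cold_oven(M0=72.0, T0=66.0, stop_at=345.0):
    """Return a list of (minutes, oven temp M, brownie temp T), one entry per minute."""
    t, M, T = 1.5, M0, T0
    history = [(t, M, T)]
    while M < stop_at:
        diff = M - T
        if diff <= 146:
            dT = diff / 146                   # Newton's law of cooling, as in (7.4.1)
        else:
            dT = -13.516 + 0.099356 * diff    # regression model, as in (7.4.2)
        M, T, t = M + 21, T + dT, t + 1       # the oven gains 21 degrees per minute
        history.append((t, M, T))
    return history

for minutes, oven, brownies in simulate_cold_oven():
    print(f"{minutes:5.1f} min   oven {oven:5.0f} F   brownies {brownies:8.3f} F")
```

The last row printed corresponds to 14.5 minutes, matching the state the text compares against 2.5 minutes in a preheated oven.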


As fascinating as the brownie heating experiment may be, this is neither an engineering nor math modeling class. 
Not to worry, the reader will not be asked to create their own models. Instead, focus on the results of the modeling 
process, equations (7.4.1) and (7.4.2). These are discrete dynamical systems. The talk about brownies and ovens has
hopefully grabbed your attention and motivated study, nothing more. In case not even that, I should mention discrete
dynamical systems are used to model phenomena in biology, medicine, physics, economics, engineering, and a host 
of other areas. Chances are, if you are studying linear algebra, discrete dynamical systems appear in your field of 
study. 


A first order discrete dynamical system is an equation

\[ \mathbf{x}_{k+1} = f(\mathbf{x}_k), \tag{7.4.3} \]

which paired with an initial condition

\[ \mathbf{x}_0 = \mathbf{v} \tag{7.4.4} \]

defines a sequence $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \ldots$. (7.4.3) is an example of a recurrence or recurrence relation and determines all
of the terms of the sequence except the first, which must be supplied separately. For the purposes of our study of linear
algebra, $\mathbf{x}_k$ are in $\mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}^n$ is an arbitrary function.


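Equation (7.4.1) can be rewritten in terms of this definition by setting $\mathbf{x}_k = \begin{bmatrix} M(k) \\ T(k) \end{bmatrix}$, from which it follows

\[ f(\mathbf{x}_k) = \begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \begin{bmatrix} 21 \\ \frac{1}{146}(M(k) - T(k)) \end{bmatrix} \]

\[ = \begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \frac{1}{146}\begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \]

\[ = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \frac{1}{146}\begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \]

\[ = \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{1}{146}\begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix}\right)\begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \]

\[ = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\begin{bmatrix} M(k) \\ T(k) \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \]

\[ = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} 21 \\ 0 \end{bmatrix}. \]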


This is an example of a nonlinear discrete dynamical system as the function f is not a linear transformation. In this 
case, the function f is affine, and we would therefore say the system is affine. It takes the form 


\[ \mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{b}. \tag{7.4.5} \]


As noted in the definition, once an initial condition is provided, a discrete dynamical system determines a se- 
quence. Our first order of business is to understand how so. For each initial condition, the sequence defined by a 
discrete dynamical system can be calculated term-by-term. The recurrence relation defines each term after the first. 
For example, the first few terms of the sequence defined by the system 


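\[ \mathbf{x}_{k+1} = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} \tag{7.4.6} \]

with initial condition

\[ \mathbf{x}_0 = \begin{bmatrix} 8 \\ -5 \\ 6 \end{bmatrix} \]

can be calculated as follows. According to (7.4.6),

\[ \mathbf{x}_1 = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\mathbf{x}_0 + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\begin{bmatrix} 8 \\ -5 \\ 6 \end{bmatrix} + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} -4.4 \\ .5 \\ -1.8 \end{bmatrix}. \]

Also according to (7.4.6),

\[ \mathbf{x}_2 = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\mathbf{x}_1 + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\begin{bmatrix} -4.4 \\ .5 \\ -1.8 \end{bmatrix} + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} 5.52 \\ -7.35 \\ 6.74 \end{bmatrix}. \]

Similarly,

\[ \mathbf{x}_3 = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\begin{bmatrix} 5.52 \\ -7.35 \\ 6.74 \end{bmatrix} + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} -2.416 \\ -.035 \\ -.782 \end{bmatrix} \]

and

\[ \mathbf{x}_4 = \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\begin{bmatrix} -2.416 \\ -.035 \\ -.782 \end{bmatrix} + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \begin{bmatrix} 3.933 \\ -6.198 \\ 5.443 \end{bmatrix} \]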
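accurate to 3 decimal places. The first five terms of the sequence are $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4$, which have just been calculated
(using this SageCell) as

\[ \begin{bmatrix} 8 \\ -5 \\ 6 \end{bmatrix}, \begin{bmatrix} -4.4 \\ .5 \\ -1.8 \end{bmatrix}, \begin{bmatrix} 5.52 \\ -7.35 \\ 6.74 \end{bmatrix}, \begin{bmatrix} -2.416 \\ -.035 \\ -.782 \end{bmatrix}, \begin{bmatrix} 3.933 \\ -6.198 \\ 5.443 \end{bmatrix}. \]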


Further terms can be calculated similarly. The process of calculating the terms is called iteration. The terms them- 
selves are called iterates or iterations, and the sequence is called the orbit of xo. 
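Iteration is mechanical enough to hand to a computer. The sketch below uses plain Python lists rather than SageMath; the helper names `affine_step` and `orbit` are illustrative, and the matrix and vector are those of (7.4.6).

```python
def affine_step(M, b, x):
    """Apply x -> Mx + b once, with M a list of rows and b, x lists."""
    return [sum(m * xj for m, xj in zip(row, x)) + bi for row, bi in zip(M, b)]

def orbit(M, b, x0, n):
    """Return the iterates x_0, x_1, ..., x_n of the affine system."""
    xs = [list(x0)]
    for _ in range(n):
        xs.append(affine_step(M, b, xs[-1]))
    return xs

M = [[-0.15, -1.3, -1.95],
     [-0.55,  1.8,  3.15],
     [ 0.15, -1.3, -2.25]]
b = [2, -5, 4]

for k, x in enumerate(orbit(M, b, [8, -5, 6], 4)):
    print(k, [round(c, 3) for c in x])
```

Raising the last argument of `orbit` produces as many further iterates as desired.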
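Given the dynamical system

\[ \mathbf{x}_{k+1} = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \tag{7.4.7} \]

from the brownie baking model, can you find the first 5 iterates in the orbit of

\[ \mathbf{x}_0 = \begin{bmatrix} 72 \\ 66 \end{bmatrix}? \]

Answer on page 267.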
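As a final exercise in iteration, the first 5 iterates in the orbit of $\mathbf{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ for the dynamical system

\[ \mathbf{x}_{k+1} = \begin{bmatrix} \frac{\sqrt6+\sqrt2}{4} & \frac{\sqrt2-\sqrt6}{4} \\ \frac{\sqrt6-\sqrt2}{4} & \frac{\sqrt6+\sqrt2}{4} \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} -2 \\ 2 \end{bmatrix} \tag{7.4.8} \]

are (approximately)

\[ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1.293 \\ 3.224 \end{bmatrix}, \begin{bmatrix} -4.083 \\ 4.780 \end{bmatrix}, \begin{bmatrix} -7.182 \\ 5.560 \end{bmatrix}, \begin{bmatrix} -10.376 \\ 5.512 \end{bmatrix}. \]

Can you verify these terms (using SageMath)? Answer on page 267.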
The first few iterates of an orbit are often not the ultimate goal, however. For many applications, the point of
interest is the long run. How can the 1000th through 2000th or 1,000,000th through 1,000,012th iterations of an orbit
be described in general terms? Such a description is called the system’s long term behavior.
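In the case of (7.4.6),

\[ \mathbf{x}_5 = \begin{bmatrix} -1.146 \\ -1.174 \\ 0.401 \end{bmatrix}, \quad \mathbf{x}_{35} = \begin{bmatrix} 1.108 \\ -3.416 \\ 2.647 \end{bmatrix}, \quad \mathbf{x}_{55} = \begin{bmatrix} 1.111 \\ -3.419 \\ 2.650 \end{bmatrix}, \quad\text{and}\quad \mathbf{x}_{80} = \begin{bmatrix} 1.111 \\ -3.419 \\ 2.650 \end{bmatrix} \]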


accurate to 3 decimal places. It takes a short while, but the terms reveal a pattern. All terms $\mathbf{x}_k$, $k \ge 55$ are, accurate
to 3 decimal places, equal to $\begin{bmatrix} 1.111 \\ -3.419 \\ 2.650 \end{bmatrix}$. The iterates after the 54th do not change much. When the iterates of
a dynamical system settle down this way for all initial values in some neighborhood, we say that the vector it is
settling on is an attractor. It “pulls” orbits toward it. But what vector is the attractor, and can we predict it without
computing large numbers of iterates? By definition, $\mathbf{x}_{k+1} = f(\mathbf{x}_k)$, so when a dynamical system has an attractor, it
means $\mathbf{x}_{k+1} = f(\mathbf{x}_k) \approx \mathbf{x}_k$ and the approximation improves as k increases. Therein lies the answer to the mystery.
The iterates are getting closer and closer to satisfying the equation

\[ f(\mathbf{x}) = \mathbf{x}. \tag{7.4.9} \]

Any solution of this equation is called a fixed point of f, and if $\mathbf{x}_k$ were such a value, we would have $\mathbf{x}_{k+1} = f(\mathbf{x}_k) = \mathbf{x}_k$.
The sequence would be fixed forever more at the value $\mathbf{x}_k$.
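For example, a fixed point of (7.4.6) satisfies

\[ \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} = \mathbf{x}, \]

an equation we can solve:

\[ \begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix}\mathbf{x} - \mathbf{x} = -\begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix} \]

\[ \left(\begin{bmatrix} -.15 & -1.3 & -1.95 \\ -.55 & 1.8 & 3.15 \\ .15 & -1.3 & -2.25 \end{bmatrix} - I\right)\mathbf{x} = -\begin{bmatrix} 2 \\ -5 \\ 4 \end{bmatrix}. \]

By row reduction (using this SageCell), it turns out

\[ \mathbf{x} = \frac{1}{117}\begin{bmatrix} 130 \\ -400 \\ 310 \end{bmatrix} \approx \begin{bmatrix} 1.111111111 \\ -3.418803419 \\ 2.649572650 \end{bmatrix}. \]

It seems clear enough the orbit is approaching the fixed point, so we have that

\[ \begin{bmatrix} 1.111111111 \\ -3.418803419 \\ 2.649572650 \end{bmatrix} \]

is an attractor of (7.4.6).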
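The row reduction can also be carried out exactly with rational arithmetic. The following is a sketch using Python's standard `fractions` module in place of the SageCell; `solve_exact` is an illustrative helper that assumes the coefficient matrix is square and invertible.

```python
from fractions import Fraction

def solve_exact(A, rhs):
    """Gauss-Jordan elimination over the rationals for a square, invertible A."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Matrix and vector of (7.4.6), entered as decimal strings so they stay exact.
M = [["-0.15", "-1.3", "-1.95"],
     ["-0.55", "1.8", "3.15"],
     ["0.15", "-1.3", "-2.25"]]
b = [2, -5, 4]

n = len(M)
M_minus_I = [[Fraction(M[i][j]) - (1 if i == j else 0) for j in range(n)] for i in range(n)]
x_star = solve_exact(M_minus_I, [-v for v in b])   # solves (M - I) x = -b

print(x_star)
print([round(float(v), 9) for v in x_star])
```

Note that `Fraction` reduces to lowest terms, so the first coordinate prints as 10/9 rather than 130/117.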


Not all orbits of discrete dynamical systems approach a fixed point, however. The dynamical system

\[ \mathbf{x}_{k+1} = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} 21 \\ 0 \end{bmatrix} \]

with initial condition

\[ \mathbf{x}_0 = \begin{bmatrix} 72 \\ 66 \end{bmatrix} \]

from the brownie baking model does not. This fact is clear by observing the behavior of the first entry of $\mathbf{x}_k$. It simply
increases by 21 with each iteration. To be precise, $(\mathbf{x}_k)_1 = 72 + 21k$, which tends to infinity as k grows. Therefore,
$\|\mathbf{x}_k\|$ diverges to $\infty$ and we say the orbit tends toward infinity.


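Finally, the orbit of $\mathbf{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ for the dynamical system

\[ \mathbf{x}_{k+1} = \begin{bmatrix} \frac{\sqrt6+\sqrt2}{4} & \frac{\sqrt2-\sqrt6}{4} \\ \frac{\sqrt6-\sqrt2}{4} & \frac{\sqrt6+\sqrt2}{4} \end{bmatrix}\mathbf{x}_k + \begin{bmatrix} -2 \\ 2 \end{bmatrix} \]

does not present a clear pattern even after 100 iterations. As computed in this SageCell, $\mathbf{x}_{98}$ through $\mathbf{x}_{100}$ are (again
accurate to three decimal places),

\[ \begin{bmatrix} -4.083 \\ 4.780 \end{bmatrix}, \begin{bmatrix} -7.182 \\ 5.560 \end{bmatrix}, \begin{bmatrix} -10.376 \\ 5.512 \end{bmatrix}. \]

Though it is likely not at all clear from this short list nor the SageMath output, the orbit exhibits a very simple pattern.
To see it, a graph of the first 100 iterations: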


[Figure: the first 100 iterates of the orbit plotted in the plane, with $\mathbf{x}_1$ and $\mathbf{x}_0 = \mathbf{x}_{24} = \mathbf{x}_{48} = \mathbf{x}_{72} = \mathbf{x}_{96}$ labeled.]


It appears there are only 24 iterations (points on the graph), but that is because they repeat. As indicated, $\mathbf{x}_0 = \mathbf{x}_{24} = \mathbf{x}_{48} = \cdots$.
Similarly, $\mathbf{x}_1 = \mathbf{x}_{25} = \mathbf{x}_{49} = \cdots$ and so on. As a result, $\mathbf{x}_{98}$ coincides with $\mathbf{x}_2$, $\mathbf{x}_{99}$ coincides with $\mathbf{x}_3$, and
$\mathbf{x}_{100}$ coincides with $\mathbf{x}_4$. When a sequence of iterates repeats this way, we say the orbit is periodic. A sequence of
iterates that approaches such a repeating sequence is called asymptotically periodic.
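The period of 24 can be confirmed numerically. The sketch below iterates (7.4.8) in plain Python; the names `c`, `s`, and `step` are illustrative. It relies on the observation (not stated in the text, but a standard trigonometric fact) that $\frac{\sqrt6+\sqrt2}{4} = \cos 15^\circ$ and $\frac{\sqrt6-\sqrt2}{4} = \sin 15^\circ$, so the matrix is a rotation by 15 degrees and 24 applications make a full turn.

```python
import math

# Entries of the matrix in (7.4.8): cos(15 degrees) and sin(15 degrees).
c = (math.sqrt(6) + math.sqrt(2)) / 4
s = (math.sqrt(6) - math.sqrt(2)) / 4

def step(x):
    """One application of the recurrence (7.4.8)."""
    return (c * x[0] - s * x[1] - 2, s * x[0] + c * x[1] + 2)

orbit = [(1.0, 1.0)]
for _ in range(100):
    orbit.append(step(orbit[-1]))

# Compare iterates 24 steps apart; differences are at the level of rounding error.
for k in (0, 1, 2):
    gap = max(abs(u - v) for u, v in zip(orbit[k], orbit[k + 24]))
    print(f"max |x_{k} - x_{k + 24}| = {gap:.2e}")
print("x_98 =", tuple(round(v, 3) for v in orbit[98]))
```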

As with the power method (section 6.2) and Markov chains (section 7.2), both of which can be framed as discrete 
dynamical systems, eigenvalues tell the story of long term behavior. For the example systems of this section, each of 
the form (7.4.5), eigenvalues of M and its spectral radius are listed in the chart. 


System     Eigenvalues of M                                          Spectral Radius    Long term behavior

(7.4.6)    $-.8$, $-.3$, $.5$                                        .8                 approaches fixed point
(7.4.7)    $1$, $\frac{145}{146}$                                    1                  tends toward infinity
(7.4.8)    $\frac{\sqrt6+\sqrt2}{4} \pm i\frac{\sqrt6-\sqrt2}{4}$    1                  periodic


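The spectral radius of a square matrix is the maximum of the magnitudes (absolute values) of its eigenvalues. The
magnitude of a complex number $a + ib$ is $\sqrt{a^2 + b^2}$, so

\[ \left|\frac{\sqrt6+\sqrt2}{4} \pm i\,\frac{\sqrt6-\sqrt2}{4}\right| = \sqrt{\left(\frac{\sqrt6+\sqrt2}{4}\right)^2 + \left(\frac{\sqrt6-\sqrt2}{4}\right)^2} = \sqrt{\frac{6 + 2\sqrt{12} + 2}{16} + \frac{6 - 2\sqrt{12} + 2}{16}} = \sqrt{\frac{6 + 2 + 6 + 2}{16}} = 1. \]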


Much like a geometric series, which converges when the ratio between consecutive terms is less than one and 
diverges when the ratio is greater than one, an affine system will approach the fixed point when the spectral radius 
is less than one and will tend toward infinity when the spectral radius is greater than one. The analogy ends there 
however. An affine dynamical system whose matrix has spectral radius one can exhibit several different behaviors: 
tendency toward infinity and periodicity as seen above, but also convergence to a fixed point, depending on the system 
and the initial condition. 
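The chart can be spot-checked without an eigenvalue solver by substituting each listed eigenvalue into $\det(M - \lambda I)$ and confirming the result is numerically zero. This is a sketch in plain Python; `det`, `char_poly_at`, and the dictionary `systems` are illustrative names, and the cofactor expansion is only sensible for small matrices like these.

```python
import math

def det(A):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def char_poly_at(A, lam):
    """Evaluate det(A - lam*I); this vanishes exactly when lam is an eigenvalue of A."""
    n = len(A)
    return det([[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)])

c = (math.sqrt(6) + math.sqrt(2)) / 4
s = (math.sqrt(6) - math.sqrt(2)) / 4

# Each system's matrix paired with the eigenvalues listed in the chart.
systems = {
    "(7.4.6)": ([[-0.15, -1.3, -1.95], [-0.55, 1.8, 3.15], [0.15, -1.3, -2.25]],
                [-0.8, -0.3, 0.5]),
    "(7.4.7)": ([[1.0, 0.0], [1 / 146, 145 / 146]], [1.0, 145 / 146]),
    "(7.4.8)": ([[c, -s], [s, c]], [complex(c, s), complex(c, -s)]),
}

radii = {}
for name, (A, eigs) in systems.items():
    residual = max(abs(char_poly_at(A, lam)) for lam in eigs)
    radii[name] = max(abs(lam) for lam in eigs)
    print(f"{name}: largest |det(M - lambda I)| = {residual:.1e}, spectral radius = {radii[name]:.4f}")
```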
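Crumpet 34: Long Term Behavior


For an affine discrete dynamical system, $\mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{b}$, where 1 is not an eigenvalue of M, the system has a unique
fixed point, $\mathbf{x}^*$:

\[ M\mathbf{x}^* + \mathbf{b} = \mathbf{x}^* \]
\[ M\mathbf{x}^* - \mathbf{x}^* = -\mathbf{b} \]
\[ (M - I)\mathbf{x}^* = -\mathbf{b} \]
\[ \mathbf{x}^* = -(M - I)^{-1}\mathbf{b}. \]

The inverse of $M - I$ exists because 1 is not an eigenvalue of M. Letting $\mathbf{y} = \mathbf{x} - \mathbf{x}^*$, which implies $\mathbf{x} = \mathbf{y} + \mathbf{x}^*$, and
substituting into $\mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{b}$:

\[ \mathbf{y}_{k+1} + \mathbf{x}^* = M(\mathbf{y}_k + \mathbf{x}^*) + \mathbf{b} = M\mathbf{y}_k + (M\mathbf{x}^* + \mathbf{b}) = M\mathbf{y}_k + \mathbf{x}^*. \]

So $\mathbf{y}_{k+1} = M\mathbf{y}_k$. The analysis of this linear dynamical system fully informs the behavior of the affine system.
Assuming M is diagonalizable, we set $\mathbf{y}_0 = \mathbf{x}_0 - \mathbf{x}^*$ (by substitution) and write $\mathbf{y}_0$ in terms of a basis of eigenvectors,
$\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ corresponding to eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then

\[ \mathbf{y}_0 = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_n\mathbf{w}_n \]

for some scalars $c_1, c_2, \ldots, c_n$, and $\mathbf{y}_1 = M(c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_n\mathbf{w}_n) = c_1\lambda_1\mathbf{w}_1 + c_2\lambda_2\mathbf{w}_2 + \cdots + c_n\lambda_n\mathbf{w}_n$,
$\mathbf{y}_2 = M(c_1\lambda_1\mathbf{w}_1 + c_2\lambda_2\mathbf{w}_2 + \cdots + c_n\lambda_n\mathbf{w}_n) = c_1\lambda_1^2\mathbf{w}_1 + c_2\lambda_2^2\mathbf{w}_2 + \cdots + c_n\lambda_n^2\mathbf{w}_n$, and so on:

\[ \mathbf{y}_k = c_1\lambda_1^k\mathbf{w}_1 + c_2\lambda_2^k\mathbf{w}_2 + \cdots + c_n\lambda_n^k\mathbf{w}_n. \]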


This solution is dominated by the nonzero term(s) with the eigenvalue(s) of greatest magnitude. If the dominant 
magnitude is less than one, y, will tend toward zero and therefore x, will tend toward x*, the fixed point. If the 
dominant magnitude is greater than one, y;, will tend toward infinity, in which case x, will tend toward infinity. 
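The dominance of the largest eigenvalue can be observed in the numbers. For system (7.4.6), whose eigenvalues are $-.8$, $-.3$, and $.5$, the distance from $\mathbf{x}_k$ to the fixed point should eventually shrink by a factor close to $.8$ per step. The sketch below checks this in plain Python; `step`, `dist`, and `ratios` are illustrative names, and the fixed point is the one found earlier by row reduction.

```python
import math

M = [[-0.15, -1.3, -1.95],
     [-0.55,  1.8,  3.15],
     [ 0.15, -1.3, -2.25]]
b = [2.0, -5.0, 4.0]
x_star = [130 / 117, -400 / 117, 310 / 117]   # fixed point of (7.4.6)

def step(x):
    """One application of x -> Mx + b."""
    return [sum(m * xj for m, xj in zip(row, x)) + bi for row, bi in zip(M, b)]

def dist(x):
    """Euclidean norm of y = x - x*."""
    return math.sqrt(sum((xi - si) ** 2 for xi, si in zip(x, x_star)))

x = [8.0, -5.0, 6.0]
ratios = []
for k in range(40):
    x_next = step(x)
    ratios.append(dist(x_next) / dist(x))     # compares ||y_{k+1}|| with ||y_k||
    x = x_next

print("first ratios:", [round(r, 4) for r in ratios[:4]])
print("last ratios: ", [round(r, 6) for r in ratios[-4:]])
```

The early ratios are irregular because all three eigenvalue terms contribute; the later ones settle near the spectral radius.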


Key Concepts 

first order discrete dynamical system an equation of the form $\mathbf{x}_{k+1} = f(\mathbf{x}_k)$.

nonlinear discrete dynamical system a discrete dynamical system whose recurrence is a nonlinear transformation.
initial condition a value $\mathbf{v}$ for the first term of a dynamical system, usually given as $\mathbf{x}_0 = \mathbf{v}$.

recurrence the type of equation appearing in a discrete dynamical system.

recurrence relation a recurrence.

affine discrete dynamical system a dynamical system of the form $\mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{b}$.

affine transformation a transformation $f : \mathbb{R}^n \to \mathbb{R}^n$ where $f(\mathbf{x}) = M\mathbf{x} + \mathbf{b}$.

iteration the process of calculating the terms of the sequence determined by a discrete dynamical system. 
iterates the terms of the sequence determined by a discrete dynamical system. 

iterations iterates. 


orbit the sequence determined by a discrete dynamical system—the solution of a dynamical system with initial 
condition. 
long term behavior a quantitative or qualitative description of the tail end of the orbit of a dynamical system. The
phrases “approaching the fixed point”, “tending toward infinity”, and “asymptotically periodic” are often used.
Another possibility for nonlinear dynamical systems is “chaotic”.


fixed point a solution of the equation f(x) = x. 
attractor the fixed point of a dynamical system whose solutions tend toward it. 
repeller the fixed point of a dynamical system whose solutions tend away from it. 


spectral radius the greatest magnitude of the eigenvalues of a matrix. 


Exercises -10 -69 5 Bis 
(o) f(x) =-= 0 54 O Ix+{ 80 
1. Calculate the first four iterates of the dynamical system 6 2 69 —-13 583 
defined by x;,41 = f(x;,) and xp = 0. : 
2. Find the fixed point(s) of the dynamical system in ques- 
(a) £0) = 5 a |x é | i | tion 1. [S]-341 [A]-366 
[S]-340 3. Find the eigenvalues of the square matrix in question 
ai ae 1. [A]-366 
(b) f(x) = 3 1 31% +] 13 | 4. Based on the information from questions 1-3, does the 
i : dynamical system have an attractor? [S|-341 [A]-366 
-7 -10 2 
(c) f(x) = 3 | 4 g Ie + 3 | 5. Describe the long term behavior of the dynamical sys- 
: tem in question 1. Calculate more iterates if needed. [S|- 
1] - 2 A 
teem" > eels 341 [A]-366 
ase 3 2 i ee 
6. Picturing an attractor, part 1. Let M = rl 3 4 6 | 
I L 4 : = : 
ie 40 —36 3/5 50 
20| 63 -5S6 -7/10 The fixed point of 
[A]-365 d 
1/3 -4 -§ Xk. = Mx, + > Xo =V 
ae : 6 
(f) f(x) sl ¢ _7 {Xt _1 | 
3 
(g) £(x) = Al . : og ie | is . | and the eigenpairs of M are >| = | and 
25) 2: 
3 1 2 =| } [A]-366 
— 1 
(h) £0x) E 3 [Xt a 5 
11 10. «OO 54 (a) Verify that the spectral radius of M is less than 1 
() foo =| -8 5-12 |x+] -27 (and therefore the fixed point is an attractor). 
aR <4. -=3 19 (b) Ona set of axes, plot the fixed point with the eigen- 
155 6A 55 3/5 vectors emanating from it. 
i 7 : (c) Pick several random points approximately 10 units 
(j) £(x) = Te -192 -76 72 |x+] 3/8 Ponte ap y : 
147 64 —47 “7/10 away from the fixed point. 
2 alculate the first 4 iterations of the orbits of eac 
[A]-365 (d) Calculate the first 4 i i f the orbits of each 
1 4 -16 8 2% point from part (c) and plot them on the same set 
(k) f(x) = | 2 -22 9 fe 5 | of axes. 
4 -20 6 3 (e) Connect each orbit with a single smooth arrow 
f 82 57 —29 5/6 through its points in the order in which they occur. 
dd) f&)=—} -16 -—3) 41 |x4+] 7/3 v2, 1-1 
18 8 3 65 1/18 7. Picturing an attractor, part 2. Let M = mae | 14 } 
1 ~26 8 10 3 The fixed point of 
(m) t= 5 -9 -1 4 I rr | : 
-15 3 2 7 Xia. = Mx, + saya [Pe 
1 153 500 ~=—-100 1/7 
(n) f(x) = —]| -48 -164 -37 |x+] 2/7 _ | 3 ; 2 F 
28 40 160 53 3/7 1s 3 and the eigenvalues of M are (+ 1). The 


[A]-365 eigenvectors are thus complex as well. 
(a) Verify that the spectral radius of M is less than 1
(and therefore the fixed point is an attractor).


(b) On a set of axes, plot the fixed point.


(c) Pick several random points approximately 8 units 
away from the fixed point. 


(d) Calculate the first 4 iterations of the orbits of each 
point from part (c) and plot them on the same set 
of axes. 


(e) Connect the orbits by drawing a smooth arrow 
through them in the order in which they occur. 


ei Saat ; _ 1] 24 -8 
8. Picturing a repeller, part 1. Let M = dl 3 26 } 


The fixed point of 


1 
Xiu1 = Mx, + 3 


is Es | and the eigenpairs of M are 2| : | and 


1? 


(a) Verify that both eigenvalues of M have magnitude 
greater than 1 (and therefore the fixed point is a 
repeller). 


(b) On a set of axes, plot the fixed point with the eigenvectors
emanating from it.


(c) Pick several random points about 1 unit from the 
fixed point. 


(d) Calculate the first 4 iterations of the orbits of each 
point from part (c) and plot them on the same set 
of axes. 


(e) Connect each orbit with a single smooth arrow 
through its points in the order in which they occur. 


v2{ v3 -1 
9. Picturi Her, part 2. Let M = — 
icturing a repeller, par' e | 1 3 


2 
The fixed point of 


Xin = Mx + 


3 v2 ; Xo = V 
6306 | 


is : | and the eigenvalues of M are a V3 +i). The 


eigenvectors are thus complex as well. 
[A]-366 


(a) Verify that both eigenvalues have magnitude 
greater than 1 (and therefore the fixed point is a 
repeller). 


(b) On a set of axes, plot the fixed point.


(c) Pick several random points approximately 1 unit 
away from the fixed point. 


(d) Calculate the first 4 iterations of the orbits of each 
point from part (c) and plot them on the same set 
of axes. 


7See Discrete Dynamical Systems: With Applications in Biology. 


(e) Connect the orbits by drawing a smooth arrow 
through them in the order in which they occur. 


10. Let $M - I$ be an invertible matrix and suppose some fixed
point of the affine dynamical system

\[ \mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{c}; \quad \mathbf{x}_0 = \mathbf{v} \]

is an attractor for any initial value $\mathbf{v}$. Argue that


(a) the fixed point is unique; and


(b) the fixed point of the dynamical system

\[ \mathbf{z}_{k+1} = M^{-1}(\mathbf{z}_k - \mathbf{c}); \quad \mathbf{z}_0 = \mathbf{v} \]

is unique and is a repeller.


11. Let $M - I$ be an invertible matrix and suppose some fixed
point of the affine dynamical system

\[ \mathbf{x}_{k+1} = M\mathbf{x}_k + \mathbf{c}; \quad \mathbf{x}_0 = \mathbf{v} \]

is a repeller for any initial value $\mathbf{v}$. Argue that


(a) the fixed point is unique; and


(b) the fixed point of the dynamical system

\[ \mathbf{z}_{k+1} = M^{-1}(\mathbf{z}_k - \mathbf{c}); \quad \mathbf{z}_0 = \mathbf{v} \]

is unique and is an attractor.


1 = 
12. Picturing a saddle point. Let M = P3 ‘ : } The 
fixed point of 
1} -13 
te 1 |ix0=y 
4 
is - | and the eigenpairs of M are + - | and 


[1 


(a) Verify that one eigenvalue has magnitude greater
than 1 while the other has magnitude less than 1
(and therefore the fixed point is neither an attractor
nor repeller).


(b) On a set of axes, plot the fixed point with the eigenvectors
emanating from it.


(c) Pick several random points closer to the line
defined by the eigenvector whose corresponding
eigenvalue has magnitude greater than 1 than they
are to the line defined by the other eigenvector.


(d) Calculate the first 4 iterations of the orbits of each 
point from part (c) and plot them on the same set 
of axes. 


(e) Connect each orbit with a single smooth arrow 
through its points in the order in which they occur. 
13. Host Parasite Interaction. Let H be the population of a
host prone to parasite population P and suppose the populations
evolve according to the discrete dynamical system

\[ H_{k+1} = \gamma H_k e^{-aP_k}, \qquad P_{k+1} = H_k\left(1 - e^{-aP_k}\right) \tag{7.4.10} \]

for some positive values of the parameters $\gamma$ and $a$.⁷


(a) Find the fixed point, $\begin{bmatrix} H^* \\ P^* \end{bmatrix}$, of (7.4.10).


(b) For $0 < \gamma < 1$, argue that the fixed point of
(7.4.10) has a negative coordinate (and therefore
is an unattainable state for the physical system,
since populations cannot be negative).


(c) For $\gamma = 1$, argue that the parasite population is zero
according to (7.4.10).


(d) The linear dynamical system

\[ \begin{bmatrix} \hat{H} \\ \hat{P} \end{bmatrix}_{k+1} = \begin{bmatrix} 1 & -\frac{\gamma\ln\gamma}{\gamma - 1} \\ \frac{\gamma - 1}{\gamma} & \frac{\ln\gamma}{\gamma - 1} \end{bmatrix}\begin{bmatrix} \hat{H} \\ \hat{P} \end{bmatrix}_k \tag{7.4.11} \]

where $\begin{bmatrix} \hat{H} \\ \hat{P} \end{bmatrix} = \begin{bmatrix} H - H^* \\ P - P^* \end{bmatrix}$, called a linearization
of (7.4.10), is an excellent approximation of
(7.4.10) near its fixed point. For $\gamma > 1$, argue that
the fixed point of (7.4.11) is a repeller (and therefore
is an unstable state for the physical system,
since populations will tend away from the fixed point).
This is enough to show that the same happens for
(7.4.10).


(e) Is this a good model for host/parasite populations
that can live in equilibrium (constant populations
each)? Explain.


Answers 
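brownie iterates Given that $\mathbf{x}_0 = \begin{bmatrix} 72 \\ 66 \end{bmatrix}$, accurate to three decimal places,

\[ \mathbf{x}_1 = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\begin{bmatrix} 72 \\ 66 \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} = \begin{bmatrix} 93 \\ 66.041 \end{bmatrix} \]

\[ \mathbf{x}_2 = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\begin{bmatrix} 93 \\ 66.041 \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} = \begin{bmatrix} 114 \\ 66.226 \end{bmatrix} \]

\[ \mathbf{x}_3 = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\begin{bmatrix} 114 \\ 66.226 \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} = \begin{bmatrix} 135 \\ 66.553 \end{bmatrix} \]

\[ \mathbf{x}_4 = \frac{1}{146}\begin{bmatrix} 146 & 0 \\ 1 & 145 \end{bmatrix}\begin{bmatrix} 135 \\ 66.553 \end{bmatrix} + \begin{bmatrix} 21 \\ 0 \end{bmatrix} = \begin{bmatrix} 156 \\ 67.022 \end{bmatrix} \]

so the first five iterates of the orbit are

\[ \begin{bmatrix} 72 \\ 66 \end{bmatrix}, \begin{bmatrix} 93 \\ 66.041 \end{bmatrix}, \begin{bmatrix} 114 \\ 66.226 \end{bmatrix}, \begin{bmatrix} 135 \\ 66.553 \end{bmatrix}, \begin{bmatrix} 156 \\ 67.022 \end{bmatrix}. \]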


last iteration example Sample SageMath code that can be copied and pasted into a SageCell: 


# Matrix and vector of the dynamical system (7.4.8)
M = 1/4*matrix(2, 2, [sqrt(6.0)+sqrt(2.0), sqrt(2.0)-sqrt(6.0),
                      sqrt(6.0)-sqrt(2.0), sqrt(6.0)+sqrt(2.0)])
b = vector([-2.0, 2.0])
x0 = vector([1, 1])
print("0 :", x0)
for i in range(1, 5):
    x0 = M*x0 + b  # apply the recurrence once
    print(i, ":", x0)


Is this a good model for host/parasite populations 
that can live in equilibrium (constant populations 
7.5 Rep-tiles [6.3]


The floors of kitchens, bathrooms, museums, and other spaces are tiled more often than not. Flat ceramic or natural 
stone tiles are placed together in a nonoverlapping way, covering the whole floor. It is very common to see square 
tiles laid out in a grid, for example. Squares are easy to fit together this way and the pattern can be extended to cover 
any amount of space. 

The plane R? can be imagined as a floor without boundaries. Covering it with tiles requires extending those tiles 
endlessly in every direction. The most familiar example is a boundless rectangular grid. Imagine rectangular graph 
paper extended forever in every direction. It may not be the most attractive covering of the plane, but it does the job. 

Any set of shapes covering the plane without overlapping is called a tessellation, and the shapes are said to 
tessellate the plane. Like squares, equilateral triangles and regular hexagons can be fitted together to tessellate the 
plane in a simple pattern. In fact these three shapes form the bases for the only three so-called regular tessellations, 
portions of which are shown here. Only your imagination can extend the patterns indefinitely. 


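[Figure: portions of the three regular tessellations, by squares, equilateral triangles, and regular hexagons.]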


Polygons that are not regular tessellate the plane just as well. Parallelograms, hexagons, dodecagons, convex and 
concave, can all be fitted to tessellate the plane. Portions of a small sample are shown here. 


[Figure: portions of a small sample of tessellations by polygons that are not regular.]


M.C. Escher famously made tessellation an artform. Tilings with irregularly shaped birds, fish, human figures, 
and other natural shapes appear in many of his most famous creations. See figure 7.5.1, for example. One way to 
create Escher-esque tessellations is to start with a regular tile and modify its perimeter in a symmetric way. The 
third tessellation of the diagram above is created from squares where each side is replaced by a zig-zag, for example. 
Among the infinite possibilities for tiles created this way are the two shown here. 


[Figure: two tiles created by modifying the perimeter of a regular tile.]


Each edge of the regular tile is replaced by a curve with 180 degree rotational symmetry about the midpoint of the 
original edge. As long as the replacements do not intersect one another, the resulting shape is a tile. That is, multiple 
copies can be fitted together to cover, or tessellate, the plane. 
Certain tiles (shapes that tessellate the plane) are actually doubly tiling. Not only can they be fitted together to tile the plane; they can also be fitted together to tile larger copies of themselves! These shapes are called rep-tiles, short for self-replicating tiles. Again, the square provides an immediate example. Four congruent squares fitted together at a corner, sides parallel to one another, form a square with side length twice the original (and four times the area). Equilateral triangles, and in fact all triangles, are rep-tiles. Four congruent copies can be pieced together, three in the same orientation and the fourth rotated 180 degrees, to form a larger copy. Regular hexagons are not rep-tiles as no finite number of copies of a hexagon can be fitted together (as tiles, without overlap) to form a hexagon. However, there are many non-regular rep-tilian hexagons. For example, the hexagon formed by gluing three squares together in an ell is a rep-tile. The following diagram demonstrates the self-replication of a square, a regular triangle, and this hexagonal ell shape. Four copies of each shape are fitted together to form an enlarged replica. Since we have already seen that these shapes tessellate the plane, they are indeed rep-tiles.

Figure 7.5.1: M.C. Escher's System X(e). The M.C. Escher Company - the Netherlands. All rights reserved. Used by permission. www.mcescher.com


Rep-tiles come in much more fanciful shapes, however. Take these three, for example. 


Appearing from left to right are the carpenter's plane of Golomb[10], a fractile of Bandt[2], and a twindragon. Seeing
that these shapes are in fact rep-tiles is nontrivial. Pictures showing them tiling replicas of themselves and tessellating 
a portion of the plane would make adequate demonstrations, but would give no insight into their origin or how to 
imagine others, or even how to define the shapes themselves. For that, we rely on linear algebra. 


After building a replica of a rep-tile (from similar copies of the rep-tile itself), one can switch perspectives and
look at the completed figure as a dissection of the larger rep-tile. From this viewpoint, rep-tiles are plane figures 
that tessellate the plane and can be dissected into finitely many similar copies of themselves. All rep-tiles can be 
seen from this vantage. Refer back to the diagram of the square, the equilateral triangle and the hexagonal ell being
fitted together to self-replicate with a different lens. The square is shown dissected into four smaller squares. The 
equilateral triangle is shown dissected into four smaller equilateral triangles, and the ell shaped hexagon is likewise 
divided into four smaller copies of itself. 
Crumpet 35: Solomon Golomb


Solomon Golomb is credited with coining the term rep-tile, but his original paper[10] only uses the term “rep-k”,
short for replicating of order k. To quote Golomb, a plane figure is called rep-k if “it can be dissected into k ‘replicas’, 
each congruent to the others and similar to the original”. Curiously, Martin Gardner[7] credits Golomb with laying 
the foundation for the study of rep-tiles and inventing the term in a series of private papers, all in his article appearing 
more than a year earlier than Golomb’s! 


By imposing a set of axes on any of these figures, rigorous mathematical descriptions of the figures become 
available. Any placement of the axes will do. Only a frame of reference is needed. For example, suppose we arrange 
for opposite corners of the square to coincide with (0, 0) and (2, 2). The following diagram shows the four squares of 
its dissection, and for each of these squares, an affine transformation mapping the whole square to the part. 


In words, the 2 × 2 matrices scale shapes (and more to the point, the square) by a factor of $\frac{1}{2}$ in both the horizontal
and vertical directions. The addition of 2 × 1 vectors provides translations. $T_4$ can thus be described as scaling by
$\frac{1}{2}$ horizontally and vertically followed by translation 1 unit horizontally.

$$T_4\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad T_4\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

for example. Again, the critical point is that the image of the 2 by 2 square
under $T_4$ is the purple square: contracting the 2 by 2 square by a factor of $\frac{1}{2}$ and then translating the contracted
copy right 1 unit lands the image of the larger square (well, squarely) on top of the purple square, the bottom
right square of the dissection. Letting S be the 2 by 2 square with opposite corners at (0,0) and (2,2), we thereby
have $T_4(S)$ = the purple square. Similarly, $T_1(S)$ = the orange square (the bottom left square of the dissection);
$T_2(S)$ = the blue square (the top left square of the dissection); and $T_3(S)$ = the green square (the top right square of
the dissection).

The union of the four images, $T_1(S)$, $T_2(S)$, $T_3(S)$, and $T_4(S)$, is the original square. In the form of an equation,

$$T_1(S) \cup T_2(S) \cup T_3(S) \cup T_4(S) = S. \tag{7.5.1}$$
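
The four maps can be tried out numerically. The sketch below is illustrative only: the helper `make_map` is a made-up name, and the translations for $T_1$, $T_2$, and $T_3$ are assumptions inferred from which quarter of the square each image is said to occupy (only the translation of $T_4$ is described in words above).

```python
def make_map(tx, ty):
    """Return the map x -> (1/2) x + (tx, ty), a similitude with scale factor 1/2."""
    def T(point):
        x, y = point
        return (0.5 * x + tx, 0.5 * y + ty)
    return T

T1 = make_map(0, 0)  # bottom left quarter (assumed translation)
T2 = make_map(0, 1)  # top left quarter (assumed translation)
T3 = make_map(1, 1)  # top right quarter (assumed translation)
T4 = make_map(1, 0)  # bottom right quarter: translate 1 unit horizontally

# Corners of the square S with opposite corners (0,0) and (2,2)
S_corners = [(0, 0), (2, 0), (2, 2), (0, 2)]

def image_corners(T):
    """Corners of T(S); a similitude maps the square onto the square with these corners."""
    return [T(c) for c in S_corners]

for name, T in [("T1", T1), ("T2", T2), ("T3", T3), ("T4", T4)]:
    print(name, image_corners(T))
```

Each image is a unit square, and the four unit squares together fill the 2 by 2 square, which is the content of (7.5.1).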




A theorem of Hutchinson[14] asserts that S is the only nonempty compact set that satisfies (7.5.1). In other words, the square
S is determined by the transformations $T_1, T_2, T_3, T_4$ via equation (7.5.1). This way, these four transformations
provide a precise description, or definition, of the square with opposite corners at (0,0) and (2,2). Incidentally,
each transformation $T_i$ is a similitude: a rigid transformation (rotation, reflection, translation, or composition
thereof) composed with dilation (scaling by the same scale factor in all directions). Similitudes preserve shape but
not necessarily size, exactly the type of transformation needed to map a shape onto one of the (similar) parts of its
dissection.
Let $\mathcal{C} = \{C_1, C_2, \ldots, C_p\}$ be a set of similitudes in $\mathbb{R}^n$ with scale factors less than one, and define


$$H_{\mathcal{C}}(A) = C_1(A) \cup C_2(A) \cup \cdots \cup C_p(A) \tag{7.5.2}$$
for any subset A of $\mathbb{R}^n$. Hutchinson's theorem concludes that there is exactly one nonempty compact set K in $\mathbb{R}^n$ such that
$H_{\mathcal{C}}(K) = K$. Moreover,


$$\lim_{m \to \infty} H_{\mathcal{C}}^m(A) = K \tag{7.5.3}$$


for any compact set A. Not only does the theorem assert the existence and uniqueness of the set K, it gives a way to 
construct it from the similitudes. 
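
A rough numerical illustration of (7.5.3) for the square: start with a one-point set A far from the square and apply the union map repeatedly. This is a sketch under stated assumptions. The names `H` and `maps` are made up, the four maps are the half-scale maps onto the quarters of the 2 by 2 square, and the starting point (10, 10) is an arbitrary choice.

```python
def H(points, maps):
    """One application of the union map: the set of all images T(p)."""
    return {T(p) for p in points for T in maps}

# Half-scale maps onto the four quarters of the square [0,2] x [0,2].
translations = [(0, 0), (0, 1), (1, 1), (1, 0)]
maps = [lambda p, t=t: (0.5 * p[0] + t[0], 0.5 * p[1] + t[1]) for t in translations]

A = {(10.0, 10.0)}          # a compact set: a single point outside the square
for _ in range(8):          # eight iterations of H
    A = H(A, maps)

xs = [p[0] for p in A]
ys = [p[1] for p in A]
print(len(A), min(xs), max(xs), min(ys), max(ys))
```

After eight iterations the 65536 points spread over a region within about 0.04 of the square, and the gap halves with every further iteration, in line with the limit in (7.5.3).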


Crumpet 36: Hutchinson 


The original theorem of Hutchinson and its proof lie along the fence between real analysis and topology. Let X =
(X, d) be a complete metric space and $\mathcal{S} = \{S_1, \ldots, S_N\}$ be a finite set of contraction maps on X. Then there exists
a unique closed bounded set K such that $K = \bigcup_{i=1}^{N} S_i K$. Furthermore, K is compact and is the closure of the set of
fixed points $s_{i_1 \ldots i_p}$ of finite compositions $S_{i_1} \circ \cdots \circ S_{i_p}$ of members of $\mathcal{S}$.


For arbitrary $A \subset X$ let $\mathcal{S}(A) = \bigcup_{i=1}^{N} S_i A$, $\mathcal{S}^p(A) = \mathcal{S}(\mathcal{S}^{p-1}(A))$. Then for closed bounded A, $\mathcal{S}^p(A) \to K$ in
the Hausdorff metric.


Applying this theorem to the set $\mathcal{T} = \{T_1, T_2, T_3, T_4\}$, we do not have to know anything about the origin of the
transformations $T_i$. All the work of dissecting the square, placing it in the plane, and deriving the transformations in
$\mathcal{T}$ can be forgotten. All we need is a compact set A (and a lot of patience!) to recover the square. It is the limit of the
sequence $H_{\mathcal{T}}(A), H_{\mathcal{T}}(H_{\mathcal{T}}(A)), H_{\mathcal{T}}(H_{\mathcal{T}}(H_{\mathcal{T}}(A))), \ldots$, the iteration of $H_{\mathcal{T}}$ on any compact set A. The first few
terms of this sequence are shown below, where A takes the shape of a kitty⁸.


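[Figure: the sets $A$, $H_{\mathcal{T}}(A)$, $H_{\mathcal{T}}(H_{\mathcal{T}}(A))$, and $H_{\mathcal{T}}(H_{\mathcal{T}}(H_{\mathcal{T}}(A)))$.]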


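$$L_1(x) = \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} x \qquad L_2(x) = \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} x + \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}$$

$$L_3(x) = \begin{bmatrix} 0 & -\frac{1}{2} \\ \frac{1}{2} & 0 \end{bmatrix} x + \begin{bmatrix} 2 \\ 0 \end{bmatrix} \qquad L_4(x) = \begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 2 \end{bmatrix}$$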

Can you generate the ell shaped rep-tile by applying (7.5.3) to $\mathcal{C} = \{L_1, L_2, L_3, L_4\}$ (and some set A of your own
creation)? Answer on page 277.

The following diagram illustrates three things. One, the dissection of a rep-tile is not unique (two different 
dissections are shown for the same right triangle). Two, the number of parts in a dissection of a rep-tile is not always 
four. Three, the parts of a dissection need not be congruent to one another (they must only be similar to the whole). 


⁸ Kitty image downloaded from https://openclipart.org/detail/292277/cute-cat.
[Figure: two dissections of the same right triangle, each drawn on a horizontal axis running from 0 to 2, with a side of length $\sqrt{5}$ labeled.]


Can you find the similitudes associated with these dissections (note there will be four similitudes associated with the 
first dissection and two similitudes associated with the second)? Answer on page 277. 


In the second dissection, the two scale factors are $\frac{1}{\sqrt{5}}$ and $\frac{2}{\sqrt{5}}$. Not coincidentally, $\left(\frac{1}{\sqrt{5}}\right)^2 + \left(\frac{2}{\sqrt{5}}\right)^2 = 1$. In general,


if $\mathcal{T} = \{T_1, T_2, \ldots, T_p\}$ is the set of similitudes that determine a rep-tile R, then

$$T_1(R) \cup T_2(R) \cup \cdots \cup T_p(R) = R$$

and the images $T_k(R)$ pairwise have no overlapping area, so

$$\operatorname{area}(T_1(R)) + \operatorname{area}(T_2(R)) + \cdots + \operatorname{area}(T_p(R)) = \operatorname{area}(R). \tag{7.5.4}$$

Now, we know from section 6.3 that $\operatorname{area}(T_k(R)) = |\det L_k| \cdot \operatorname{area}(R)$ where $L_k$ is the matrix of the linear part of
similitude $T_k$. We also know from section 3.7 that the determinant of a product is the product of the determinants and
from section 3.5 that multiplying a 2 × 2 matrix by scalar c multiplies its determinant by $c^2$. Finally, combined with
the facts that the determinants of reflections and rotations are −1 and 1 respectively, the determinant of any matrix of
a similitude is $\pm s^2$, where s is its scale factor, so its absolute value is $s^2$. Applying this information to equation (7.5.4), we get

$$s_1^2 \operatorname{area}(R) + s_2^2 \operatorname{area}(R) + \cdots + s_p^2 \operatorname{area}(R) = \operatorname{area}(R)$$

where $s_k$ is the scale factor of similitude $T_k$. Hence

$$s_1^2 + s_2^2 + \cdots + s_p^2 = 1. \tag{7.5.5}$$


The square, the hexagonal ell, and the triangle were all dissected into four parts, each of which was a $\frac{1}{2}$ scale
replica of the whole. By equation (7.5.5) it must be that $\left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = 1$, an equality that is not hard to
verify. As a matter of vocabulary, the set of similitudes associated with a rep-tile is an iterated function system, or
IFS. Hence, if $s_1, s_2, \ldots, s_p$ are the scale factors of the similitudes of the IFS of a rep-tile, then $s_1^2 + s_2^2 + \cdots + s_p^2 = 1$.
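
The sum-of-squares condition is easy to test for a proposed list of scale factors. The snippet below is a small sketch; the function name `squares_sum` is made up for illustration, and the second list uses the scale factors $\frac{1}{\sqrt{5}}$ and $\frac{2}{\sqrt{5}}$ of the two-part triangle dissection.

```python
from fractions import Fraction
import math

def squares_sum(scale_factors):
    """Sum of the squares of the scale factors of an IFS."""
    return sum(s * s for s in scale_factors)

# Four half-scale copies (square, hexagonal ell, triangle): exact arithmetic.
four_halves = [Fraction(1, 2)] * 4
print(squares_sum(four_halves))

# Two-part dissection of the right triangle: floating point, so compare with a tolerance.
two_parts = [1 / math.sqrt(5), 2 / math.sqrt(5)]
print(math.isclose(squares_sum(two_parts), 1.0))
```

A list of scale factors that fails this check cannot come from the IFS of a rep-tile; passing the check is necessary but does not by itself guarantee that a dissection exists.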

Returning to the carpenter’s plane of Golomb, the fractile of Bandt, and the twindragon, shown below are dissec- 
tions. 


Imposing a set of axes on any one of the dissections allows developing the similitudes mapping the shape to its parts. 
Much like a center and radius define a circle or two points define a line, the collection of these similitudes defines the 
rep-tile. 
Key Concepts


rep-tile a plane figure that tessellates the plane and can be dissected into finitely many similar copies. Equivalently, 
a plane figure that tiles the plane and tiles an enlarged replica of itself. 


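Hutchinson's theorem (a special case) Let $\mathcal{C} = \{C_1, C_2, \ldots, C_p\}$ be a set of similitudes in $\mathbb{R}^2$ with scale factors less
than one, and define


$$H_{\mathcal{C}}(A) = C_1(A) \cup C_2(A) \cup \cdots \cup C_p(A) \tag{7.5.6}$$


for any subset A of $\mathbb{R}^2$. Then there is exactly one nonempty compact set K in $\mathbb{R}^2$ such that $H_{\mathcal{C}}(K) = K$. Moreover,
$$\lim_{m \to \infty} H_{\mathcal{C}}^m(A) = K \tag{7.5.7}$$
for any nonempty compact set A.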
similitude a rigid transformation composed with a dilation. Similitudes preserve shape but not necessarily size 
rigid transformation a rotation, reflection, or translation. 
dilation a map of the form T(x) = rx for some real number r > 0—scaling by the same factor in all directions. 
compact set a subset S of $\mathbb{R}^n$ is compact if it is closed and bounded.


closed set a subset S of $\mathbb{R}^n$ is closed if the limit of every convergent sequence of points in S is also in S. Alternatively,
S is closed if it contains all of its limit points.


bounded set a subset S of $\mathbb{R}^n$ is bounded if there exists a real number M such that $S \subset \{x \in \mathbb{R}^n : \|x\| < M\}$.
Alternatively, S is bounded if it is contained within some ball centered at the origin.


iterated function system a set of contraction mappings. 


contraction mapping a map $T : \mathbb{R}^n \to \mathbb{R}^n$ is a contraction (mapping) if there exists a real number s < 1 such that
for every pair of distinct points x and y in $\mathbb{R}^n$,


$$\frac{d(T(x), T(y))}{d(x, y)} \le s.$$


A contraction mapping scales down the distance between every pair of distinct points. A similitude with scale 
factor less than one is a contraction mapping. 


IFS iterated function system. 


scale factors of the IFS of a rep-tile if $s_1, s_2, \ldots, s_p$ are the scale factors of the similitudes of the IFS of a rep-tile,
then $s_1^2 + s_2^2 + \cdots + s_p^2 = 1$.


Exercises 
(b) = 
3 
1. Determine an affine transformation that maps the large 
figure to the similar (smaller) figure. =y54 
t 
4 0 0 1 
(a) 
i [S]-341 
- () : |__| 
t => 
: i 0 0 1 
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(@) sf | 
=>—t 
0 0 3 4 
[A]-366 
(e) —4 (2, 3V3) 4 
3 
as 
4 4 q ; 
0 o| 7 
ay 
() 4 (2,2V3) © (2,2V3) 
a 
4 4 
0 o| 1 2 8 4 
* (2,43) 
my N 
=>-2 
, 
2 a 
Ph 
3 
=>-2 
, 
o| 4 $4 


[A]-366 


2. Build a larger copy of the figure from similar copies of it- 
self, thereby showing that it has one feature of a rep-tile. 
Appropriate sizes and number of copies can be found on 
page 278. 


(a) 


(b) [A]-366 


(c) 


(d) 


yy 


[A]-366 
(e) 


(f) [A]-366 


> 


[A]-366 


7 


G) 


- 
3. Show that the shape in question 2 tessellates the plane.
This completes a demonstration that the shape is a rep-tile. [S]-342 [A]-367

4. The 3-4-5 right triangle is a rep-tile that can be dissected 
into two parts as shown. 


(2, 2V3) 


Ny WO & 


4 


(a) Use similar triangles to calculate the scale factors
of the two similitudes, $s_1$ and $s_2$, of its IFS.


(b) Verify that $s_1^2 + s_2^2 = 1$.


[S]-342 


5. Find a dissection of the 30-60-90 triangle into three congruent
parts, each similar to the whole, showing that the
triangle is a rep-tile. Equivalently, build a 30-60-90
triangle⁹ out of three congruent 30-60-90 triangles.

6. The right triangle with side lengths 1, 2, and $\sqrt{5}$ can be
dissected into five congruent parts, each similar to the 
whole, two different ways. Find one of them. [A]-367 

7. What are the scale factors of the IFS for the triangle in 
question 


(a) 5 
(b) 6 [A]-367 


(e) 


(4.5, 4.5V3) 


8. A rectangle can be dissected into three congruent parts, 
making it a rep-tile. Equivalently, three congruent copies 
of this rectangle can be fitted together to form a (larger) 
similar copy. What is the ratio of its side lengths? 


9. Find an IFS of the rep-tile suggested by the dissection. 
Impose your own set of axes where not supplied. 


we N WwW fF UV ans @ 


8 9 Jo 11] 


(f) The images of the line segment under the
similitudes of the IFS are the line segments
from (0,0) to (0,5) and from (5,0) to (5,5).


(b) [A]-367 


0 [A]-367 


⁹ A 30-60-90 triangle is one whose interior angles measure 30, 60, and 90 degrees.
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(h) 


(k) 


[A]-367 


¹⁰ https://lqbrin.github.io/tea-time-linear/rep-tile-designer.html


10. Check your answers for question 9 with the rep-tile
designer.¹⁰ It will be helpful to have your similitudes
written in terms of the designer's format. Each similitude
should be expressed as a composition of


(i) a reflection (across the x-axis, y-axis or neither)
(ii) a scaling (scale factor)
(iii) a rotation (in degrees about the origin)
(iv) a horizontal translation
(v) a vertical translation


in that order. [S]-343 [A]-367


11. Find the three scale factors of the IFS suggested by the 
dissection. [S]-344 


2.6 


5 


12. Find an expression for c in terms of a and b. HINT: Write 
down five equations involving the three scale factors of 
the IFS suggested by the dissection. Four of them can 
be used to eliminate the scale factors, leaving a single 
equation with just a,b,c. Solve this equation for c. 


a 
Answers


generating the L-shape Letting A take the shape of a pumpkin¹¹, the L-shape appears rather plainly after only three
iterations:


[Figure: the sets $A$, $H_{\mathcal{C}}(A)$, $H_{\mathcal{C}}(H_{\mathcal{C}}(A))$, and $H_{\mathcal{C}}(H_{\mathcal{C}}(H_{\mathcal{C}}(A)))$, each drawn on axes running from 0 to 2.]

dissecting the triangle For the first dissection, the four transformations mapping the whole triangle to the four parts 
are, in words, 


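1. scale by a factor of $\frac{1}{2}$,
2. scale by a factor of $\frac{1}{2}$ and then translate 1 unit right,
3. scale by a factor of $\frac{1}{2}$ and then translate $\frac{1}{2}$ unit up, and
4. scale by a factor of $\frac{1}{2}$, rotate (about the origin) by 180°, and then translate $\frac{1}{2}$ unit up and one unit right.


As affine transformations, the mappings are


$$x \mapsto \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} x; \qquad x \mapsto \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} x + \begin{bmatrix} 1 \\ 0 \end{bmatrix};$$

$$x \mapsto \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} x + \begin{bmatrix} 0 \\ \frac{1}{2} \end{bmatrix}; \qquad x \mapsto \begin{bmatrix} -\frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{bmatrix} x + \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix}.$$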
Os 3 o = 1 


For the second dissection, remember all the triangles are similar, so corresponding parts are in proportion. 


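In particular, the smallest triangle is a $\frac{1}{\sqrt{5}}$ scaled version of the whole and the remaining part is a $\frac{2}{\sqrt{5}}$ scaled
version. Getting a little ahead of ourselves, transformations mapping the whole triangle to the two parts are, in
words,


1. scale by a factor of $\frac{1}{\sqrt{5}}$, reflect about the y-axis, rotate by angle $\beta$ (counterclockwise about the origin),
then translate along line segment x; and


2. scale by a factor of $\frac{2}{\sqrt{5}}$, reflect about the x-axis, rotate by angle $-\theta$ (clockwise about the origin), then
translate along line segment x.


To quantify the rotations and translations, we need to calculate x and the sines and cosines of $\beta$ and $\theta$. Using
the Pythagorean theorem, $1^2 = x^2 + \left(\frac{1}{\sqrt{5}}\right)^2$ so $x = \frac{2}{\sqrt{5}}$ and the coordinates of P are $(x\cos\beta, x\sin\beta)$. But $\cos\beta = \frac{1}{\sqrt{5}}$
and $\sin\beta = \frac{2}{\sqrt{5}}$. Finally $\cos\theta = \frac{2}{\sqrt{5}}$ and $\sin\theta = \frac{1}{\sqrt{5}}$, so the mappings are


$$x \mapsto \frac{1}{\sqrt{5}}\begin{bmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{bmatrix}\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} x + \begin{bmatrix} \frac{2}{5} \\ \frac{4}{5} \end{bmatrix}; \qquad x \mapsto \frac{2}{\sqrt{5}}\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} x + \begin{bmatrix} \frac{2}{5} \\ \frac{4}{5} \end{bmatrix}$$

which simplify as

$$x \mapsto \begin{bmatrix} -\frac{1}{5} & -\frac{2}{5} \\ -\frac{2}{5} & \frac{1}{5} \end{bmatrix} x + \begin{bmatrix} \frac{2}{5} \\ \frac{4}{5} \end{bmatrix}; \qquad x \mapsto \begin{bmatrix} \frac{4}{5} & -\frac{2}{5} \\ -\frac{2}{5} & -\frac{4}{5} \end{bmatrix} x + \begin{bmatrix} \frac{2}{5} \\ \frac{4}{5} \end{bmatrix}.$$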
5: 5 5 “5: 5 5 


¹¹ Pumpkin image downloaded from https://openclipart.org/detail/86665/plain-pumpkin.




Solutions to Selected Exercises 


Section 1.1 


1a: The number of rows always comes first in the size of a matrix, so the matrix has 15 rows.
2b: The number of columns always comes second in the size of a matrix, so the matrix has 5 columns. 


3c: A matrix with size 4 x 14 has 4 rows with 14 entries each (equivalently it has 14 columns with 4 entries each) for 
a total of 4- 14 = 56 entries. 


4d: $M_{3,1}$ means the entry of M in the third row and first column, so the answer is −2.
5d: $N_{:,3}$ means the third column of N, so its size is 5 rows by 1 column or 5 × 1.


Sh: M42 means the submatrix formed by deleting row 4 and column 2 of N, which means M\42 will have one fewer 
row and one fewer column than N making its size 4 x 5. 


6b: $A_{6,:}$ means the sixth row of A so it is $\begin{bmatrix} 2 & -4 & 10 & -7 & -3 \end{bmatrix}$.


6f: $A_{2:4,2:3}$ means the submatrix of A containing the intersection of rows two through four with columns 2 through 3,
so

$$A_{2:4,2:3} = \begin{bmatrix} -11 & 10 \\ -10 & 12 \\ -1 & 3 \end{bmatrix}.$$


Section 1.2 


1i: Scalar products can always be computed, and they are computed entry-wise. Each entry is multiplied by the scalar:

$$2\begin{bmatrix} -1 & 6 \\ 8 & 15 \end{bmatrix} = \begin{bmatrix} 2(-1) & 2(6) \\ 2(8) & 2(15) \end{bmatrix}$$

1n: Since these matrices are not the same size, there are entries in one that have no corresponding entry in the other.
Therefore the difference is not defined. It cannot be computed.


6: In mathematics and logic a statement is either always true (true for all possible values of the variables) or it is 
false. Since there are matrices for which $M - N \neq N - M$ (see counterexample below) the statement is false.


| 


9: SageMath uses calculator notation to do arithmetic computation, so 3A+4T is input as 3*A+4*T. See SageMath Cell
123. The result is
[ 286 188 -89 -110 -118 -132]
[-156 94 65 -30 -132 -58] 
[ -51 -224 -195 -184 -324 -104] 
[ 6 281 120 112 122 -38] 
[ 22 -5 -51 -49 155 179] 


Section 1.3 


1d: For matrix multiplication to be defined, the left matrix must have the same number of columns (in this example it 
-9 2 
has one) as the right matrix has rows (in this example it has three). Therefore the matrix product | —4 | | 9 | 
is undefined. 


1d: Row-column multiplication is the sum of the entry-by-entry products (first entry times first entry plus second
entry times second entry plus third entry times third entry):

$$\begin{bmatrix} -1 & 0 & -3 \end{bmatrix}\begin{bmatrix} 6 \\ 3 \\ 5 \end{bmatrix} = -1(6) + 0(3) + (-3)(5) = -6 + 0 - 15 = -21.$$


2d: Because a column matrix has exactly one element per row and a row matrix one element per column, every 
column-matrix-row-matrix product is defined. The left matrix has the same number of columns (one) as the 


right matrix has rows (also one). The i,j-entry of the product is the product of the entry in the i-th row of the
left matrix with the entry in the j-th column of the right matrix. For example, the 1,1-entry of the product is
(6.3)(2.3) = 14.49 and the 2,1-entry is (4.1)(2.3) = 9.43. Placing the right matrix just to the right and below
the left matrix can help with the organization:

$$\begin{array}{c|cc} 6.3 & 14.49 & 28.35 \\ 4.1 & 9.43 & 18.45 \\ 3.4 & 7.82 & 15.3 \\ \hline & 2.3 & 4.5 \end{array}$$

The answer is $\begin{bmatrix} 14.49 & 28.35 \\ 9.43 & 18.45 \\ 7.82 & 15.3 \end{bmatrix}$.


2e: In matrix multiplication, the left matrix must have the same number of columns (must be as wide) as the right 
matrix has rows (is tall). The left matrix of this example has 3 columns while the right matrix has 4 rows, so 
the product is undefined. 


2f: The i,j-entry of a product is the product of the i-th row of the left matrix with the j-th column of the right matrix.
Therefore the product will have as many rows as the left matrix and as many columns as the right matrix. In
this example, that means 2 rows and 1 column.

1,1-entry: $\begin{bmatrix} -3 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix} = 1$

2,1-entry: $\begin{bmatrix} 2 & 5 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix} = 45$

Placing the right matrix just to the right and below the left matrix can help keep this straight:

$$\begin{array}{ccc|c} -3 & 0 & 1 & 1 \\ 2 & 5 & 7 & 45 \\ \hline & & & \begin{matrix} 1 \\ 3 \\ 4 \end{matrix} \end{array}$$

The answer is $\begin{bmatrix} 1 \\ 45 \end{bmatrix}$.
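3e: $u^T v = -74$:

$$u^T v = \begin{bmatrix} 10 & 2 & -3 \end{bmatrix}\begin{bmatrix} -11 \\ 3 \\ -10 \end{bmatrix} = 10(-11) + 2(3) + (-3)(-10) = -110 + 6 + 30 = -74$$

3g: $v^T u = -74$:

$$v^T u = \begin{bmatrix} -11 & 3 & -10 \end{bmatrix}\begin{bmatrix} 10 \\ 2 \\ -3 \end{bmatrix} = (-11)10 + (3)2 + (-10)(-3) = -110 + 6 + 30 = -74$$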


14a: Matrices or vectors may be used in SageMath. 


print((u.transpose()*v)[0,0]) produces 
35150214 

print (vector(u)*vector(v)) produces 
35150214 


15: The transpose of a matrix is computed using the . transpose() method in SageMath. 


$Q^T R$: print(Q.transpose()*R) produces


[ -571 -2517 -378 -941 100 -1176] 
[-3588 -2891 -2430 -1283 -1422 -432] 
[-2838 -1795 -2412 -1092 -1456 162] 
[ -93 2587 1859 2531 1318 857] 
[-2980 -2369 -957 -1053 -250 -379] 
[ -660 1567 -678 708 -1417 514] 


$Q R^T$: print(Q*R.transpose()) produces


[-2563 1613 1516 -1620 -280] 
[ 1452 617 -5796 2035 1519] 
[ 2529 -1187 -1066 650 1886] 
[-1058 2668 575 -1211 85] 
[ 919 -140 -787 -221 1144] 


They are not equal. Note they are not even the same size. $Q^T R$ is 6 × 6 while $Q R^T$ is 5 × 5.
Section 1.4


1f: $\|u\|$ is the magnitude or norm of u, which is defined as $\sqrt{u^T u}$. In this case,

$$\|u\| = \sqrt{\begin{bmatrix} 2 & -6 & 12 \end{bmatrix}\begin{bmatrix} 2 \\ -6 \\ 12 \end{bmatrix}} = \sqrt{2^2 + (-6)^2 + 12^2} = \sqrt{4 + 36 + 144} = \sqrt{184}$$
2f: d(u, v) is the distance between u and v, which is defined as the norm of their difference, $\|u - v\|$. In this case,

$$\|u - v\| = \left\|\begin{bmatrix} 8 \\ 4 \\ 8 \end{bmatrix}\right\| = \sqrt{8^2 + 4^2 + 8^2} = \sqrt{144} = 12$$

3f: The simplest way to check whether two vectors are orthogonal is to check whether their dot product is zero. In
this case,

$$u^T v = \begin{bmatrix} 2 & -6 & 12 \end{bmatrix}\begin{bmatrix} -6 \\ -10 \\ -4 \end{bmatrix} = 2(-6) + (-6)(-10) + 12(-4) = -12 + 60 - 48 = 0$$
4 


Since the dot product is zero, the vectors are orthogonal. 


4c: In order for the vectors to be orthogonal their dot product must be zero. Setting the dot product equal to zero 
gives an equation that can be solved for k: 


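$$\begin{bmatrix} -2 & -6 & -3 \end{bmatrix}\begin{bmatrix} -7 \\ k \\ -10 \end{bmatrix} = 0$$

$$(-2)(-7) + (-6)k + (-3)(-10) = 0$$
$$14 - 6k + 30 = 0$$
$$44 = 6k$$
$$\frac{22}{3} = k$$

so the solution is $k = \frac{22}{3}$.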


5c: The sum of vectors is the vector with tail coinciding with the tail of the first addend and head coinciding with the 
head of the last addend. In this case, the tail of the first addend is at (0, 0) and the head of the last addend is at 


(5,3), so the answer is the vector from (0, 0) to (5, 3): $\begin{bmatrix} 5 \\ 3 \end{bmatrix}$.


6c: Following the hint: 


12 
10b:


The orange vector w is the one requested. Since it is being added to u, its tail is placed at the head of u, and
since the distance of this sum from v is supposed to be 1 unit, the head of w must land one unit from the
head of v. There are many answers (none of which are shown in the diagram), for example placing the head of
w at (10, −5) we have

$$w = \begin{bmatrix} 10 \\ -5 \end{bmatrix} - \begin{bmatrix} -6 \\ -12 \end{bmatrix} = \begin{bmatrix} 16 \\ 7 \end{bmatrix}.$$

Solutions may also be found algebraically. Plugging values of u and v into the given equation will allow solving
for w:

$$d(u + w, v) = 1$$
$$\|u + w - v\| = 1$$
$$\left\|\begin{bmatrix} -6 \\ -12 \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} - \begin{bmatrix} 9 \\ -5 \end{bmatrix}\right\| = 1$$
$$\left\|\begin{bmatrix} w_1 - 15 \\ w_2 - 7 \end{bmatrix}\right\| = 1$$
$$\sqrt{(w_1 - 15)^2 + (w_2 - 7)^2} = 1$$
$$(w_1 - 15)^2 + (w_2 - 7)^2 = 1$$
$$(w_1 - 15)^2 = 1 - (w_2 - 7)^2 \tag{7.5.8}$$

There are infinitely many solutions. We may choose $w_2$ arbitrarily as long as $1 - (w_2 - 7)^2 \ge 0$ and solve for
$w_1$. For example, $w_2 = 7.5$ and $w_1 = 15 + \sqrt{1 - .5^2} = 15 + \frac{\sqrt{3}}{2}$, giving
$\begin{bmatrix} 15 + \frac{\sqrt{3}}{2} \\ 7.5 \end{bmatrix}$ as one solution. Note that
the solution derived from the sketch satisfies (7.5.8) since $(16 - 15)^2 = 1 - (7 - 7)^2$.
Yes. Their dot product is zero: 
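$$(-12.1u)^T(0.12v) = -1.452\,u^T v = 0.$$


We do not need to have coordinates to draw this conclusion:

$$\begin{aligned}
(-12.1u)^T(0.12v) &= \left(-12.1\begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}\right)\left(0.12\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}\right) \\
&= \begin{bmatrix} -12.1u_1 & -12.1u_2 & \cdots & -12.1u_n \end{bmatrix}\begin{bmatrix} 0.12v_1 \\ 0.12v_2 \\ \vdots \\ 0.12v_n \end{bmatrix} \\
&= (-12.1u_1)(0.12v_1) + (-12.1u_2)(0.12v_2) + \cdots + (-12.1u_n)(0.12v_n) \\
&= -1.452u_1v_1 - 1.452u_2v_2 - \cdots - 1.452u_nv_n \\
&= -1.452(u_1v_1 + u_2v_2 + \cdots + u_nv_n) \\
&= -1.452\,u^T v
\end{aligned}$$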

That they are orthogonal can also be argued geometrically. Scaling a vector does not change its direction, 
so scaling u and v does not change either of their directions. If they are orthogonal to begin with, they are 
orthogonal after scaling. 


11: The code 


v = D.row(2) 
print (v.norm()) 
produces
sqrt(177)


Note the same result can be reached with the single line print(D.row(2).norm()).


Section 1.5 


1c: According to formula (1.5.4), 
$$C_{2,1} = (-1)^{2+1}\det(-6) = -\det(-6)$$
1e: According to formula (1.5.4),


nae oe AOS 2 SS. a0 
Cis =(-D cer( eae J = aer( 2 


2c: The determinant of a 1 x 1 matrix is the lone entry of the matrix, so det(30) = 30. 


2g: Using formula (1.5.1), or equivalently formula (1.5.3), 


$$\det\begin{bmatrix} 18 & 5 \\ 14 & -16 \end{bmatrix} = 18(-1)^{1+1}\det(-16) + 5(-1)^{1+2}\det(14) = 18(-16) - 5(14) = -358$$
2k: Using formula (1.5.1), or equivalently formula (1.5.3), 
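$$\det\begin{bmatrix} -3 & -1 & -9 \\ 1 & -4 & -8 \\ 2 & 9 & 6 \end{bmatrix} = -3(-1)^{1+1}\det\begin{bmatrix} -4 & -8 \\ 9 & 6 \end{bmatrix} + (-1)(-1)^{1+2}\det\begin{bmatrix} 1 & -8 \\ 2 & 6 \end{bmatrix} + (-9)(-1)^{1+3}\det\begin{bmatrix} 1 & -4 \\ 2 & 9 \end{bmatrix}$$

$$= -3\big((-4)(6) - (-8)(9)\big) + \big(1(6) - (-8)(2)\big) - 9\big(1(9) - (-4)(2)\big)$$
$$= -3(48) + 22 - 9(17) = -275$$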
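
Cofactor expansion along the first row translates directly into a short recursive function. The sketch below is plain Python rather than SageMath's built-in determinant, and the function name `det` is chosen only for illustration; it reproduces the values found in 2g and 2k.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (M is a list of rows)."""
    n = len(M)
    if n == 1:
        return M[0][0]          # the determinant of a 1x1 matrix is its lone entry
    total = 0
    for j in range(n):
        # Minor: delete the first row and column j
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        # With 0-based j, the sign (-1)^(1 + (j+1)) equals (-1)^j
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[18, 5], [14, -16]]))
print(det([[-3, -1, -9], [1, -4, -8], [2, 9, 6]]))
```

The recursion mirrors the hand computation but is slow for large matrices, since the number of terms grows like n factorial.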


20: Using formula (1.5.1), or equivalently formula (1.5.3), 


: : : , 2 4 8 -2 
det =5(-1)'*! det} -1 6 O |40+2(-1)'*? det] 0 -1 0 
Ce) ed q. =] 3 0 3 3 
0 3 -1 3 
4° 8 6 
ssc) 0 -1 6 | 
0 3 -1 


6 O 1 
= 5(saer( 13 J- 6aer( 


3 
-1 0 0 
+2(4aer( 3 3 J-saer( 0 


-l1 6 0 0 -l 
- 8 4cr( 3-4 J-sacr( j | }-+6aer( 3 ) 


= 5(8(18 — 0) — 6(—3 — 0) — 2(1 — 18)) 
2 (4(—3 — 0) — 8(0 — 0) — 2(0 —- 0)) 

— 8(4(1 — 18) — 8(0 — 0) + 6(0 — 0)) 
= 5(196) + 2(—12) — 8(—68) = 1500 
Section 1.6
1c: The | x 1 identity matrix is [ 1 |. so we need a matrix M such that 
ML we l=L a |M=[ 1]. 


There is no formula. We just have to recognize that the matrix in question is the | x | matrix with its lone entry 
equal to the reciprocal of a ; 


1g: Using formula (1.6.2), 


| - E 11 | 
—-5 4 5(4) — (-3)\(5) - 3)(-5) | Ci2 Co2 
5 
1 


ae orhety 


— Ul 


li: . pe v7 is not a square matrix, so it does not have an inverse 

“112 v2 5 _ 
6 3 0 

Im: Letting M=] -1 -1 6 | and using formula (1.6.2), 

0 O 7 

1 Cir Coy Cay 

“= 1 Cir Cra C32 

Gut Ci3 C23 C33 


and 


det M = 6aet( =4 : | = : = 6(-9- 3-0 = - 


0 0 

Ci, = dt a 7 jaa 

Cay =~aet{ 6 7 \ao2d 

Ca, = det = ¢)=18 

Cia =~ der a + \e7 
Can = det( 7 \=#2 
Cra =~ det Z 5 7-36 
Cia = det . ' }=0 


9: SageMath Cell 124. One way to complete the code is


A = matrix(3,3,[7,-5,-2,-3,3,1,-3,2,1]) 
B = matrix(3,3,[1,2,2,2,8,7,-3,-5,-5]) 

# Compute (AB)^-1
print("(a)")
print((A*B).inverse()); print()
# Compute A^-1 B^-1
print("(b)")
print(A.inverse()*B.inverse()); print()
# Compute B^-1 A^-1
print("(c)")
print(B.inverse()*A.inverse())


which produces 


(a) 
[-11 -7 -17] 
[-20 -13 -30] 
[ 26 17 39] 
(b) 
[-2 0 -1] 


Therefore 
; [oe =P 2 , 1 
~ 0 0 =3 0 0 
1.4 -70 80 864.9 
5: Applying equation (1.6.3) withA =| -—29 95 |andAB=| -62 —52 |, 
-12 -43 —32 52 
80 4.9 1.4 
-62 -52 |-B'=(AB)B! =A=| -29 
-—32 52 -12 
This operation can be thought of as right-multiplying both sides of 
14 —70 80 4.9 
—29 95 |-B=| -62 -52 
-12 -43 -—32 52 
by BT: 
1.4 -70 80 4.9 
-29 95 |-B-B!=| -62 -52 
-12 -43 —32 52 
which reduces to 
1.4 -70 80 49 
-29 95 |-1=] -62 -52 |-B! 
-12 -43 —32. 52 
and finally to 
1.4 -70 80 4.9 
-29 95 |=| -62 -52 |-B. 
-12 -43 -32 52 
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[-25 2 -7] 
[58 -5 15] 
(c) 

[-11 -7 -17] 
[-20 -13 -30] 
[ 26 17 39] 


10: SageMath Cell 125. One way to code the computation is


encoder = matrix(3,3,[1,-4,-2,-3,7,3,0,2,1]) 


decoder = encoder.inverse() 


message = matrix(3,5,[-589,-602,-244,-546, 33, 
861,958,224, 768,-99, 
339,317,180, 325,0]) 


print("Encoded message:") 
print (message); print() 
print("Decoded message:") 
print (decoder*message) 


which produces 


Encoded message: 

[-589 -602 -244 -546 33] 
[ 861 958 224 768 -99] 
[ 339 317 180 325 0] 


Decoded message: 

[ 89 32 116 104 33] 
[111 103 32 105 0]
[117 111 116 115 0] 


and these numbers are ASCII codes for “You got this!”. 
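
Reading the decoded matrix column by column and converting each code to a character recovers the message. The sketch below is plain Python (no SageMath needed); it assumes the message was packed column by column with zeros as padding, and it takes the last decoded column to be 33, 0, 0, which is what the encoder requires in order to produce the encoded column 33, −99, 0.

```python
encoder = [[1, -4, -2],
           [-3, 7, 3],
           [0, 2, 1]]

decoded = [[89, 32, 116, 104, 33],
           [111, 103, 32, 105, 0],
           [117, 111, 116, 115, 0]]

# Read the codes down each column, left to right, skipping the zero padding.
codes = [decoded[r][c] for c in range(5) for r in range(3)]
text = "".join(chr(n) for n in codes if n != 0)
print(text)

# Re-encode as a check: encoder * decoded should reproduce the encoded message.
encoded = [[sum(encoder[i][k] * decoded[k][j] for k in range(3)) for j in range(5)]
           for i in range(3)]
print(encoded)
```

Re-encoding the decoded matrix is a quick way to catch a transcription slip in either matrix.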


Section 1.7 


le: Sin 


—4 1 
Av=| 2 0 
—4 -] 


—2 


0 0 
Since | 2 | = | 1 the eigenvalue 2 must be 2. 


URES 


2d: The characteristic polynomial of a square matrix M is $\det(M - \lambda I)$. In this case,


val] | 


1 
0 


0 
1 


oo 
|) = er( g <D=4 


= (-8— AY(-2- 2) +9. 


| 


ce V is an eigenvector of A, it must be that Av = Av for some scalar 2. Computing Av will reveal the value of 
a: 


The characteristic polynomial is $(-8 - \lambda)(-2 - \lambda) + 9$ and can be expanded to yield the standard form $\lambda^2 + 10\lambda + 25$.
3d: The eigenvalues of a square matrix $A$ are the roots of its characteristic polynomial. That is, solutions of the
equation $\det(A - \lambda I) = 0$. In this case,

\[
\begin{aligned}
\det\begin{bmatrix} -7-\lambda & 25 \\ -1 & 3-\lambda \end{bmatrix} &= 0 \\
(-7-\lambda)(3-\lambda) + 25 &= 0 \\
\lambda^2 + 4\lambda + 4 &= 0 \\
(\lambda+2)^2 &= 0 \\
\lambda &= -2
\end{aligned}
\]

so this matrix has one eigenvalue, $-2$.
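
For a 2 x 2 matrix the characteristic polynomial is always $\lambda^2 - (\text{trace})\lambda + \det$, which gives a quick way to double-check 3d. The sketch below is plain Python; the matrix entries are an assumption recovered from the worked equations above, and the helper name `char_poly_2x2` is made up for this illustration.

```python
from fractions import Fraction

def char_poly_2x2(m):
    """Return (1, -trace, det): the coefficients of lambda^2 - trace*lambda + det."""
    (a, b), (c, d) = m
    return (1, -(a + d), a * d - b * c)

A = [[-7, 25], [-1, 3]]          # assumed entries for exercise 3d
coeffs = char_poly_2x2(A)        # coefficients of lambda^2 + 4*lambda + 4
disc = coeffs[1] ** 2 - 4 * coeffs[2]

# A zero discriminant means a single repeated root, -b/(2a).
root = Fraction(-coeffs[1], 2 * coeffs[0])
print(coeffs, disc, root)
```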


3k: The eigenvalues of a square matrix $A$ are the roots of its characteristic polynomial. That is, solutions of the
equation $\det(A - \lambda I) = 0$. In this case, $\det(A - \lambda I) = 0$ expands to

\[
\begin{aligned}
(-3-\lambda)\left((13-\lambda)(-15-\lambda) + 192\right) &= 0 \\
(-3-\lambda)(\lambda^2 + 2\lambda - 3) &= 0 \\
-\lambda^3 - 5\lambda^2 - 3\lambda + 9 &= 0.
\end{aligned}
\]

Checking for integer roots first, the rational roots theorem says the only possibilities are factors of the constant
term, 9, divided by factors of the leading coefficient, $-1$. That is, $\pm 1, \pm 3, \pm 9$. Starting with 1,
$-(1)^3 - 5(1)^2 - 3(1) + 9 = -1 - 5 - 3 + 9 = 0$. How lucky, a hit on the first try! Factoring $\lambda - 1$ out of
$-\lambda^3 - 5\lambda^2 - 3\lambda + 9$ using synthetic division:

\[
\begin{array}{r|rrrr}
1 & -1 & -5 & -3 & 9 \\
  &    & -1 & -6 & -9 \\ \hline
  & -1 & -6 & -9 & 0
\end{array}
\]

yields $-\lambda^3 - 5\lambda^2 - 3\lambda + 9 = (\lambda - 1)(-\lambda^2 - 6\lambda - 9)$. Completing the factoring (factoring the quadratic $-\lambda^2 - 6\lambda - 9$)
gives the characteristic equation

\[
\begin{aligned}
(\lambda - 1)(-\lambda - 3)(\lambda + 3) &= 0 \\
\lambda &= -3, 1
\end{aligned}
\]

so this matrix has two eigenvalues, $-3$ and 1.
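
The synthetic division in 3k is mechanical enough to hand to a few lines of plain Python. This is only a sketch of the standard bring-down, multiply, add routine; the function name `synthetic_division` is invented for the illustration.

```python
def synthetic_division(coeffs, root):
    """Divide a polynomial by (x - root).

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder).
    """
    row = [coeffs[0]]                     # bring down the leading coefficient
    for c in coeffs[1:]:
        row.append(c + root * row[-1])    # multiply by the root, add the next coefficient
    return row[:-1], row[-1]

# -x^3 - 5x^2 - 3x + 9 divided by (x - 1)
quotient, remainder = synthetic_division([-1, -5, -3, 9], 1)
print(quotient, remainder)

# The quadratic quotient divided by (x + 3)
linear, remainder2 = synthetic_division(quotient, -3)
print(linear, remainder2)
```

A zero remainder confirms the trial value is a root, and the quotient is the factor that remains to be factored.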


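4c: The eigenpair $\lambda, \mathbf{v}$ satisfies the equation $A\mathbf{v} = \lambda\mathbf{v}$, which can be solved for $\mathbf{v}$. Letting $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$,

\[
\begin{aligned}
\begin{bmatrix} -4 & 2 \\ -16 & 8 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} &= \begin{bmatrix} 0 \\ 0 \end{bmatrix} \\
\begin{bmatrix} -4v_1 + 2v_2 \\ -16v_1 + 8v_2 \end{bmatrix} &= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\end{aligned}
\]

so $v_1$ and $v_2$ must satisfy the system

\[
\begin{aligned}
-4v_1 + 2v_2 &= 0 \\
-16v_1 + 8v_2 &= 0
\end{aligned}
\]

Solving the first equation for $v_2$ as an attempt to solve the system by substitution: $2v_2 = 4v_1$ so $v_2 = 2v_1$.
Substituting into the second equation, $-16v_1 + 8(2v_1) = 0$ yields $0 = 0$, a true statement for all values of $v_1$!
This means $v_1$ can be anything and $v_2$ must be $2v_1$. For example, $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector
but any vector of the form $\begin{bmatrix} v_1 \\ 2v_1 \end{bmatrix}$ is a valid solution.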

11d: [SageMath Cell 126] One possible solution is


M = matrix(2,2,[-8,-3,3,-2])
print(M); print()
print(M.charpoly())


which produces 


[-8 -3] 
[ 3 -2] 


x^2 + 10*x + 25


The solution of 2d was $\lambda^2 + 10\lambda + 25$, which is the same except for the (dummy) variable, so is the same
solution.
Section 2.1 
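3b: The linear system represented by the matrix is
\[
\begin{aligned}
11v_1 &= 9 \\
5v_2 &= -7 \\
v_3 &= -13 \\
-2v_4 &= 6
\end{aligned}
\]
so the solution is $v_1 = \frac{9}{11}$, $v_2 = -\frac{7}{5}$, $v_3 = -13$, $v_4 = -3$.
3d: The linear system represented by the matrix is
\[
\begin{aligned}
11v_1 + 9v_3 &= 12 \\
-8v_2 - 4v_3 &= -1 \\
v_3 &= 2
\end{aligned}
\]
Starting with $v_3 = 2$ and substituting into the other equations yields $-8v_2 - 4(2) = -1$ so $v_2 = -\frac{7}{8}$ and
$11v_1 + 9(2) = 12$ so $v_1 = -\frac{6}{11}$. The solution is therefore $v_1 = -\frac{6}{11}$, $v_2 = -\frac{7}{8}$, $v_3 = 2$.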
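
The substitution used in 3d is back substitution: start from the last equation and work upward. A minimal plain-Python sketch with exact fractions follows; the coefficient rows are read off the system in 3d as an assumption, and `back_substitute` is a name invented for the illustration.

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve Ux = b for an upper triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Move the already-known terms to the right-hand side, then divide.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / Fraction(U[i][i])
    return x

U = [[11, 0, 9],
     [0, -8, -4],
     [0, 0, 1]]
b = [12, -1, 2]
solution = back_substitute(U, b)
print(solution)
```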


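9c: Adding 4 times row 2 to row 3 of the identity matrix yields the given matrix:

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{bmatrix}
\]

so the elementary row operation must be adding 4 times row 2 to row 3.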
12b: The first row of the left matrix holds the coefficients of the linear combination so the first row of the product
must be

\[
-2\begin{bmatrix} 4 & -3 & 2 \end{bmatrix} + 1\begin{bmatrix} 12 & 8 & -5 \end{bmatrix} = \begin{bmatrix} -8+12 & 6+8 & -4-5 \end{bmatrix} = \begin{bmatrix} 4 & 14 & -9 \end{bmatrix}
\]


Section 2.2 


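1a: Answers will vary depending on the steps taken. For a matrix with two rows and no columns of zeros, the only
requirement of row echelon form is a 0 in the 2,1-entry. One way to reduce is by first scaling the two rows and
then replacing the second row:

\[
\begin{bmatrix} -2 & -4 & -10 \\ -5 & 2 & -1 \end{bmatrix}
\xrightarrow{-5A_{1,:}\to A_{1,:},\; 2A_{2,:}\to A_{2,:}}
\begin{bmatrix} 10 & 20 & 50 \\ -10 & 4 & -2 \end{bmatrix}
\xrightarrow{A_{1,:}+A_{2,:}\to A_{2,:}}
\begin{bmatrix} 10 & 20 & 50 \\ 0 & 24 & 48 \end{bmatrix}
\]
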

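1d: Answers will vary depending on the steps taken. For a matrix with three rows and no columns of zeros, row
echelon form requires zeros in the 2,1-, 3,1-, and 3,2-entries. One way to reduce is as follows:

\[
\begin{bmatrix} -1 & 0 & 2 \\ 4 & 1 & -3 \\ 1 & 1 & 3 \end{bmatrix}
\xrightarrow{4A_{1,:}+A_{2,:}\to A_{2,:},\; A_{1,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} -1 & 0 & 2 \\ 0 & 1 & 5 \\ 0 & 1 & 5 \end{bmatrix}
\xrightarrow{-A_{2,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} -1 & 0 & 2 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix}
\]
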
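2a: Reduced row echelon form requires ones in the pivot positions and zeros above. Beginning with the echelon
form of question 1a, the pivot positions are the 1,1- and 2,2-entries.

\[
\begin{bmatrix} 10 & 20 & 50 \\ 0 & 24 & 48 \end{bmatrix}
\xrightarrow{\frac{1}{10}A_{1,:}\to A_{1,:},\; \frac{1}{24}A_{2,:}\to A_{2,:}}
\begin{bmatrix} 1 & 2 & 5 \\ 0 & 1 & 2 \end{bmatrix}
\xrightarrow{-2A_{2,:}+A_{1,:}\to A_{1,:}}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \end{bmatrix}
\]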
2d: Reduced row echelon form requires ones in the pivot positions and zeros above. Beginning with the echelon
form of question 1d, the pivot positions are the 1,1- and 2,2-entries. The only thing remaining to do is get a one
in the 1,1-entry.

\[
\begin{bmatrix} -1 & 0 & 2 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{-A_{1,:}\to A_{1,:}}
\begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix}
\]
3a: A homogeneous system has zero constants, so the associated system is
\[
\begin{aligned}
3x_1 - x_2 &= 0 \\
5x_2 &= 0
\end{aligned}
\]

The second equation requires $x_2 = 0$ and substituting this value into the first equation reveals that $x_1 = 0$ also.
There are no nontrivial solutions.


3f: A homogeneous system has zero constants, so the associated system is

\[
\begin{aligned}
5x_1 - 3x_3 &= 0 \\
-4x_2 + x_3 &= 0 \\
0 &= 0
\end{aligned}
\]
Answers will vary since there are infinitely many solutions; $x_1 = 12$, $x_2 = 5$, $x_3 = 20$ is one example. The
third equation is always true (no matter the values of $x_1, x_2, x_3$). The second equation requires $x_2 = \frac{1}{4}x_3$ and the
first equation requires $x_1 = \frac{3}{5}x_3$. Any solution where these requirements of $x_1$ and $x_2$ are met will suffice. For
example, choosing $x_3 = 20$, we get $x_2 = 5$ and $x_1 = 12$.


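4b: Putting the coefficients in an augmented matrix and row reducing to reduced row echelon form:

\[
\begin{bmatrix} 1 & 4 & -4 \\ -3 & -11 & -5 \end{bmatrix}
\xrightarrow{3A_{1,:}+A_{2,:}\to A_{2,:}}
\begin{bmatrix} 1 & 4 & -4 \\ 0 & 1 & -17 \end{bmatrix}
\xrightarrow{-4A_{2,:}+A_{1,:}\to A_{1,:}}
\begin{bmatrix} 1 & 0 & 64 \\ 0 & 1 & -17 \end{bmatrix}
\]

The reduced row echelon form represents the system $v_1 = 64$, $v_2 = -17$ (the solution).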
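
To check a hand reduction like 4b without SageMath, the same three row operations can be sketched in plain Python with exact fractions. This is one straightforward Gauss-Jordan loop, not necessarily the sequence of steps a person would pick by hand; `rref` here is a name chosen for the illustration, not the Sage method.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        # Swap: find a row at or below pivot_row with a nonzero entry in this column.
        source = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if source is None:
            continue
        rows[pivot_row], rows[source] = rows[source], rows[pivot_row]
        # Scale: make the pivot entry equal to one.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Replace: clear every other entry in the pivot column.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

reduced = rref([[1, 4, -4], [-3, -11, -5]])   # augmented matrix of 4b
print(reduced)
```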
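4h: Putting the coefficients in an augmented matrix and row reducing to reduced row echelon form:

\[
\begin{bmatrix} -3 & -35 & 10 & 2 \\ 9 & 130 & -40 & 2 \\ 9 & 120 & -35 & -4 \end{bmatrix}
\xrightarrow{3A_{1,:}+A_{2,:}\to A_{2,:},\; 3A_{1,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} -3 & -35 & 10 & 2 \\ 0 & 25 & -10 & 8 \\ 0 & 15 & -5 & 2 \end{bmatrix}
\xrightarrow{3A_{2,:}\to A_{2,:},\; -5A_{3,:}\to A_{3,:}}
\begin{bmatrix} -3 & -35 & 10 & 2 \\ 0 & 75 & -30 & 24 \\ 0 & -75 & 25 & -10 \end{bmatrix}
\]
\[
\xrightarrow{A_{2,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} -3 & -35 & 10 & 2 \\ 0 & 75 & -30 & 24 \\ 0 & 0 & -5 & 14 \end{bmatrix}
\xrightarrow{2A_{3,:}+A_{1,:}\to A_{1,:},\; -6A_{3,:}+A_{2,:}\to A_{2,:}}
\begin{bmatrix} -3 & -35 & 0 & 30 \\ 0 & 75 & 0 & -60 \\ 0 & 0 & -5 & 14 \end{bmatrix}
\]
\[
\xrightarrow{\frac{1}{15}A_{2,:}\to A_{2,:}}
\begin{bmatrix} -3 & -35 & 0 & 30 \\ 0 & 5 & 0 & -4 \\ 0 & 0 & -5 & 14 \end{bmatrix}
\xrightarrow{7A_{2,:}+A_{1,:}\to A_{1,:}}
\begin{bmatrix} -3 & 0 & 0 & 2 \\ 0 & 5 & 0 & -4 \\ 0 & 0 & -5 & 14 \end{bmatrix}
\]

Scaling each row appropriately produces

\[
\begin{bmatrix} 1 & 0 & 0 & -\frac{2}{3} \\ 0 & 1 & 0 & -\frac{4}{5} \\ 0 & 0 & 1 & -\frac{14}{5} \end{bmatrix}
\]

from which the solution is clearly $v_1 = -\frac{2}{3}$, $v_2 = -\frac{4}{5}$, $v_3 = -\frac{14}{5}$.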


Section 2.3 
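4b: Begin by reducing:
\[
\begin{bmatrix} 2 & 4 & 5 \\ 0 & 2 & -7 \\ 0 & -2 & 7 \end{bmatrix}
\xrightarrow{-2A_{2,:}+A_{1,:}\to A_{1,:},\; A_{2,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} 2 & 0 & 19 \\ 0 & 2 & -7 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\frac{1}{2}A_{1,:}\to A_{1,:},\; \frac{1}{2}A_{2,:}\to A_{2,:}}
\begin{bmatrix} 1 & 0 & \frac{19}{2} \\ 0 & 1 & -\frac{7}{2} \\ 0 & 0 & 0 \end{bmatrix}
\]
which means $x_3$ is a free variable and the solution is
\[
x_1 = -\tfrac{19}{2}x_3, \qquad x_2 = \tfrac{7}{2}x_3.
\]
In parametric vector form, using $r$ for the arbitrary parameter:
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = r\begin{bmatrix} -\frac{19}{2} \\ \frac{7}{2} \\ 1 \end{bmatrix}.
\]
Equivalently,
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = s\begin{bmatrix} -19 \\ 7 \\ 2 \end{bmatrix}.
\]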
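5c: Eigenvectors are solutions of the system $(A - \lambda I)\mathbf{v} = \mathbf{0}$, for each eigenvalue $\lambda$.

$\lambda = -6$: $A - (-6)I = \begin{bmatrix} 12 & -4 & 16 \\ 3 & -1 & 4 \\ -6 & 2 & -8 \end{bmatrix}$ which reduces as follows:
\[
\begin{bmatrix} 12 & -4 & 16 \\ 3 & -1 & 4 \\ -6 & 2 & -8 \end{bmatrix}
\xrightarrow{\frac{1}{4}A_{1,:}\to A_{1,:}}
\begin{bmatrix} 3 & -1 & 4 \\ 3 & -1 & 4 \\ -6 & 2 & -8 \end{bmatrix}
\xrightarrow{-A_{1,:}+A_{2,:}\to A_{2,:},\; 2A_{1,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} 3 & -1 & 4 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
Hence $\mathbf{v} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}^T$ must satisfy only $3v_1 - v_2 + 4v_3 = 0$. $v_2$ and $v_3$ are free and $v_1 = \frac{1}{3}v_2 - \frac{4}{3}v_3$.
In parametric vector form,
\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = r\begin{bmatrix} 1/3 \\ 1 \\ 0 \end{bmatrix} + s\begin{bmatrix} -4/3 \\ 0 \\ 1 \end{bmatrix}.
\]

$\lambda = -3$: $A - (-3)I = \begin{bmatrix} 9 & -4 & 16 \\ 3 & -4 & 4 \\ -6 & 2 & -11 \end{bmatrix}$ which reduces as follows:
\[
\begin{bmatrix} 9 & -4 & 16 \\ 3 & -4 & 4 \\ -6 & 2 & -11 \end{bmatrix}
\xrightarrow{A_{1,:}\leftrightarrow A_{2,:}}
\begin{bmatrix} 3 & -4 & 4 \\ 9 & -4 & 16 \\ -6 & 2 & -11 \end{bmatrix}
\xrightarrow{-3A_{1,:}+A_{2,:}\to A_{2,:},\; 2A_{1,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} 3 & -4 & 4 \\ 0 & 8 & 4 \\ 0 & -6 & -3 \end{bmatrix}
\xrightarrow{\frac{1}{4}A_{2,:}\to A_{2,:},\; -\frac{1}{3}A_{3,:}\to A_{3,:}}
\begin{bmatrix} 3 & -4 & 4 \\ 0 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\]
\[
\xrightarrow{-A_{2,:}+A_{3,:}\to A_{3,:}}
\begin{bmatrix} 3 & -4 & 4 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{2A_{2,:}+A_{1,:}\to A_{1,:}}
\begin{bmatrix} 3 & 0 & 6 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\frac{1}{3}A_{1,:}\to A_{1,:},\; \frac{1}{2}A_{2,:}\to A_{2,:}}
\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & \frac{1}{2} \\ 0 & 0 & 0 \end{bmatrix}
\]
Hence $\mathbf{v} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}^T$ must satisfy $v_1 = -2v_3$ and $v_2 = -\frac{1}{2}v_3$. $v_3$ is free. In parametric vector form,
\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = t\begin{bmatrix} -2 \\ -\frac{1}{2} \\ 1 \end{bmatrix}.
\]
In summary, eigenvectors take the form $r\begin{bmatrix} 1/3 \\ 1 \\ 0 \end{bmatrix} + s\begin{bmatrix} -4/3 \\ 0 \\ 1 \end{bmatrix}$ or the form $t\begin{bmatrix} -2 \\ -\frac{1}{2} \\ 1 \end{bmatrix}$.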
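
An eigenvector answer is easy to verify: multiply and compare. The plain-Python sketch below checks the three basis vectors found in 5c. The matrix `A` is an assumption, recovered by subtracting $6I$ from the displayed $A - (-6)I$; `mat_vec` is a helper written for the illustration.

```python
from fractions import Fraction as F

def mat_vec(A, v):
    """Matrix-vector product with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Assumed: recovered from the displayed A - (-6)I by subtracting 6 on the diagonal.
A = [[6, -4, 16],
     [3, -7, 4],
     [-6, 2, -14]]

pairs = [
    (-6, [F(1, 3), 1, 0]),
    (-6, [F(-4, 3), 0, 1]),
    (-3, [-2, F(-1, 2), 1]),
]

# Each check asks whether A v equals lambda v, entry by entry.
checks = [mat_vec(A, v) == [lam * x for x in v] for lam, v in pairs]
print(checks)
```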


14: One way to complete the code is as follows. 


M=matrix(5,5,[2049,-4548,-511,-5177, 6023, -4526,10252,916,11438,
-13292, -6947,15538,1740, 17601, -20614, -1388, 2866,
263, 3166, -3697,-5781,12812,1211, 14321, -16671])
print("M ="); print(M); print()
# Find a row echelon form (but not reduced row echelon form)
print("Row echelon form:")
print(M.echelon_form()); print()
# Find the reduced row echelon form
print("Reduced row echelon form:")
print(M.rref())


which produces: 


M =
[  2049  -4548   -511  -5177   6023]
[ -4526  10252    916  11438 -13292]
[ -6947  15538   1740  17601 -20614]
[ -1388   2866    263   3166  -3697]
[ -5781  12812   1211  14321 -16671]

Row echelon form:
[   1   38  102  149 -184]
[   0  134   67  402 -201]
[   0    0  134  268 -134]
[   0    0    0  804  268]
[   0    0    0    0    0]

Reduced row echelon form:
[   1    0    0    0 -1/3]
[   0    1    0    0 -5/3]
[   0    0    1    0 -5/3]
[   0    0    0    1  1/3]
[   0    0    0    0    0]
Section 3.1


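2b: For any $r \times s$ matrix $L$ and any $s \times t$ matrix $R$, the $i,j$-entry of $LR$ is

\[
(LR)_{i,j} = L_{i,1}R_{1,j} + L_{i,2}R_{2,j} + \cdots + L_{i,s}R_{s,j}. \tag{7.5.9}
\]

Now suppose $A$ is $\ell \times m$, $B$ is $m \times n$, and $C$ is $n \times p$. Then applying formula (7.5.9) with various substitutions
for matrices $L$ and $R$ and counts $s$:

\[
\begin{aligned}
(A(BC))_{i,j} &= A_{i,1}(BC)_{1,j} + A_{i,2}(BC)_{2,j} + \cdots + A_{i,m}(BC)_{m,j} \\
&= A_{i,1}\left(B_{1,1}C_{1,j} + B_{1,2}C_{2,j} + \cdots + B_{1,n}C_{n,j}\right) \\
&\quad + A_{i,2}\left(B_{2,1}C_{1,j} + B_{2,2}C_{2,j} + \cdots + B_{2,n}C_{n,j}\right) \\
&\quad + \cdots + A_{i,m}\left(B_{m,1}C_{1,j} + B_{m,2}C_{2,j} + \cdots + B_{m,n}C_{n,j}\right)
\end{aligned} \tag{7.5.10}
\]

and

\[
\begin{aligned}
((AB)C)_{i,j} &= (AB)_{i,1}C_{1,j} + (AB)_{i,2}C_{2,j} + \cdots + (AB)_{i,n}C_{n,j} \\
&= \left(A_{i,1}B_{1,1} + A_{i,2}B_{2,1} + \cdots + A_{i,m}B_{m,1}\right)C_{1,j} \\
&\quad + \left(A_{i,1}B_{1,2} + A_{i,2}B_{2,2} + \cdots + A_{i,m}B_{m,2}\right)C_{2,j} \\
&\quad + \cdots + \left(A_{i,1}B_{1,n} + A_{i,2}B_{2,n} + \cdots + A_{i,m}B_{m,n}\right)C_{n,j}
\end{aligned} \tag{7.5.11}
\]

By inspection, (7.5.10) and (7.5.11) both contain the $mn$ terms

\[
A_{i,x}B_{x,y}C_{y,j}, \qquad x = 1, 2, \ldots, m \text{ and } y = 1, 2, \ldots, n
\]

and therefore are equal.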
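
A numerical spot check is no substitute for the proof in 2b, but it can make the bookkeeping concrete. The plain-Python sketch below multiplies three small matrices both ways and also builds each entry directly from the $mn$ terms $A_{i,x}B_{x,y}C_{y,j}$. The matrices are made up for the illustration.

```python
def mat_mul(L, R):
    """Product LR using the entry formula (LR)_{i,j} = sum over k of L_{i,k} R_{k,j}."""
    inner = len(R)
    return [[sum(L[i][k] * R[k][j] for k in range(inner))
             for j in range(len(R[0]))]
            for i in range(len(L))]

# Made-up matrices: A is 3 x 2, B is 2 x 3, C is 3 x 2.
A = [[1, -2], [0, 3], [4, 1]]
B = [[2, 0, -1], [5, 1, 3]]
C = [[1, 2], [-3, 0], [2, -1]]

left = mat_mul(A, mat_mul(B, C))     # A(BC)
right = mat_mul(mat_mul(A, B), C)    # (AB)C

# Each entry as the sum of all terms A[i][x] * B[x][y] * C[y][j].
triple = [[sum(A[i][x] * B[x][y] * C[y][j]
               for x in range(len(B)) for y in range(len(C)))
           for j in range(len(C[0]))]
          for i in range(len(A))]

print(left == right == triple)
```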


2g: For the arbitrary matrix $M$, $(M^T)_{i,j} = M_{j,i}$, which follows from the definition of the transpose. Applying this
observation twice,

\[
\left((A^T)^T\right)_{i,j} = (A^T)_{j,i} = A_{i,j}.
\]


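3a: By the definition of inverse, (1.6.1), it must be shown that the product of the two matrices in either order is the
identity.

To show that $(B^{-1}A^{-1})(AB) = I$:

\[
\begin{aligned}
(B^{-1}A^{-1})(AB) &= \left((B^{-1}A^{-1})A\right)B && \text{theorem 2 claim 4} \\
&= \left(B^{-1}(A^{-1}A)\right)B && \text{theorem 2 claim 4} \\
&= (B^{-1}I)B && \text{definition of inverse, (1.6.1)} \\
&= B^{-1}B && \text{theorem 2 claim 4} \\
&= I && \text{definition of inverse, (1.6.1)}
\end{aligned}
\]

To show that $(AB)(B^{-1}A^{-1}) = I$ is similar:

\[
\begin{aligned}
(AB)(B^{-1}A^{-1}) &= A\left(B(B^{-1}A^{-1})\right) && \text{theorem 2 claim 4} \\
&= A\left((BB^{-1})A^{-1}\right) && \text{theorem 2 claim 4} \\
&= A(IA^{-1}) && \text{definition of inverse, (1.6.1)} \\
&= AA^{-1} && \text{theorem 2 claim 4} \\
&= I && \text{definition of inverse, (1.6.1)}
\end{aligned}
\]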


6: According to theorem 3 part 1, $3A + 7A = 10A$, so the calculation can be done without calculating $3A$ or $7A$ like
so:

\[
3A + 7A = 10A = 10\begin{bmatrix} 11 & 10 & -6 \\ -3 & 6 & 7 \end{bmatrix} = \begin{bmatrix} 110 & 100 & -60 \\ -30 & 60 & 70 \end{bmatrix}
\]

You can check that $3A + 7A = \begin{bmatrix} 110 & 100 & -60 \\ -30 & 60 & 70 \end{bmatrix}$ by calculating $3A$ and $7A$ and adding them.
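Section 3.2
1c: One way to proceed is to subtract from both sides and then multiply both sides by $-1$:
\[
\begin{aligned}
\left(\begin{bmatrix} 0 & -19 \\ -1 & 8 \end{bmatrix} - X\right) - \begin{bmatrix} 0 & -19 \\ -1 & 8 \end{bmatrix} &= \begin{bmatrix} -2 & 19 \\ -14 & 20 \end{bmatrix} - \begin{bmatrix} 0 & -19 \\ -1 & 8 \end{bmatrix} \\
-X &= \begin{bmatrix} -2 & 38 \\ -13 & 12 \end{bmatrix} \\
-1(-X) &= -1\begin{bmatrix} -2 & 38 \\ -13 & 12 \end{bmatrix} \\
X &= \begin{bmatrix} 2 & -38 \\ 13 & -12 \end{bmatrix}
\end{aligned}
\]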
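2b: Left-multiply both sides by a multiplicative inverse. There is no division of matrices.
\[
\begin{aligned}
\begin{bmatrix} 5 & 2 \\ 6 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 5 & 2 \\ 6 & 3 \end{bmatrix}X &= \begin{bmatrix} 5 & 2 \\ 6 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 13 & -13 \\ -19 & 7 \end{bmatrix} \\
IX &= \frac{1}{3}\begin{bmatrix} 3 & -2 \\ -6 & 5 \end{bmatrix}\begin{bmatrix} 13 & -13 \\ -19 & 7 \end{bmatrix} \\
X &= \frac{1}{3}\begin{bmatrix} 77 & -53 \\ -173 & 113 \end{bmatrix}
\end{aligned}
\]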
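
The same left-multiplication can be carried out exactly in plain Python. This sketch uses the familiar 2 x 2 inverse formula (swap the diagonal, negate the off-diagonal, divide by the determinant); the matrices are read off the solution of 2b as an assumption, and the helper names are invented for the illustration.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the determinant formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_mul(L, R):
    """Product LR for matrices given as lists of rows."""
    return [[sum(L[i][k] * R[k][j] for k in range(len(R)))
             for j in range(len(R[0]))]
            for i in range(len(L))]

A = [[5, 2], [6, 3]]            # coefficient matrix (assumed from 2b)
B = [[13, -13], [-19, 7]]       # right-hand side (assumed from 2b)

X = mat_mul(inverse_2x2(A), B)  # left-multiply by the inverse
print(X)
print(mat_mul(A, X) == B)       # substituting back should reproduce B
```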


3d: Right-multiply both sides by $P$ and left-multiply both sides by $P^{-1}$:

\[
\begin{aligned}
(PDP^{-1})P &= AP \\
(PD)(P^{-1}P) &= AP \\
PD &= AP \\
P^{-1}(PD) &= P^{-1}AP \\
(P^{-1}P)D &= P^{-1}AP \\
D &= P^{-1}AP
\end{aligned}
\]

Since $P^{-1}$ appears in the equation being solved, it is assumed to exist.


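6c: The second row of a matrix product is the linear combination of the rows of the righthand matrix with coefficients
coming from the second row of the lefthand matrix. In symbols, $(AB)_{2,:} = A_{2,1}B_{1,:} + A_{2,2}B_{2,:} + \cdots + A_{2,n}B_{n,:}$
(assuming $A$ has $n$ columns and $B$ has $n$ rows). Applied to this question,

\[
\begin{aligned}
& 0\begin{bmatrix} -3 & 3 \end{bmatrix} - 3\begin{bmatrix} 2 & -4 \end{bmatrix} + 3\begin{bmatrix} -5 & 5 \end{bmatrix} + 2\begin{bmatrix} 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 0 & 0 \end{bmatrix} + \begin{bmatrix} -6 & 12 \end{bmatrix} + \begin{bmatrix} -15 & 15 \end{bmatrix} + \begin{bmatrix} 0 & 2 \end{bmatrix} \\
&= \begin{bmatrix} -21 & 29 \end{bmatrix}
\end{aligned}
\]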


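7e: The third row of a matrix product is the linear combination of the rows of the righthand matrix with coefficients
coming from the third row of the lefthand matrix. In symbols, $(AB)_{3,:} = A_{3,1}B_{1,:} + A_{3,2}B_{2,:} + \cdots + A_{3,n}B_{n,:}$
(assuming $A$ has $n$ columns and $B$ has $n$ rows). Applied to this question,

\[
\begin{aligned}
& 4\begin{bmatrix} -3 & 3 \end{bmatrix} + 1\begin{bmatrix} 2 & -4 \end{bmatrix} + 4\begin{bmatrix} -5 & 5 \end{bmatrix} - 5\begin{bmatrix} 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} -12 & 12 \end{bmatrix} + \begin{bmatrix} 2 & -4 \end{bmatrix} + \begin{bmatrix} -20 & 20 \end{bmatrix} + \begin{bmatrix} 0 & -5 \end{bmatrix} \\
&= \begin{bmatrix} -30 & 23 \end{bmatrix}
\end{aligned}
\]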
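8c: The second column of a matrix product is the linear combination of the columns of the lefthand matrix with
coefficients coming from the second column of the righthand matrix. In symbols, $(AB)_{:,2} = A_{:,1}B_{1,2} + A_{:,2}B_{2,2} +
\cdots + A_{:,n}B_{n,2}$ (assuming $A$ has $n$ columns and $B$ has $n$ rows). Applied to this question,

\[
3\begin{bmatrix} -4 \\ 0 \\ 4 \end{bmatrix} - 4\begin{bmatrix} 5 \\ -3 \\ 1 \end{bmatrix} + 5\begin{bmatrix} -3 \\ 3 \\ 4 \end{bmatrix} + 1\begin{bmatrix} -4 \\ 2 \\ -5 \end{bmatrix}
= \begin{bmatrix} -12 \\ 0 \\ 12 \end{bmatrix} + \begin{bmatrix} -20 \\ 12 \\ -4 \end{bmatrix} + \begin{bmatrix} -15 \\ 15 \\ 20 \end{bmatrix} + \begin{bmatrix} -4 \\ 2 \\ -5 \end{bmatrix}
= \begin{bmatrix} -51 \\ 29 \\ 23 \end{bmatrix}
\]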

9c: The product has no third column since the righthand matrix has no third column. 
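
The row view and the column view of a product used in 6c, 7e, and 8c can be sketched in plain Python. The matrices below are an assumption: they are pieced together from the rows and columns that appear in those three solutions, on the guess that the exercises share one pair of matrices. The function names are invented for the illustration, and indices are 0-based.

```python
# Assumed matrices, assembled from the solutions of 6c, 7e, and 8c.
A = [[-4, 5, -3, -4],
     [0, -3, 3, 2],
     [4, 1, 4, -5]]
B = [[-3, 3],
     [2, -4],
     [-5, 5],
     [0, 1]]

def row_of_product(A, B, i):
    """Row i of AB as a linear combination of the rows of B."""
    result = [0] * len(B[0])
    for coeff, b_row in zip(A[i], B):
        result = [r + coeff * b for r, b in zip(result, b_row)]
    return result

def column_of_product(A, B, j):
    """Column j of AB as a linear combination of the columns of A."""
    result = [0] * len(A)
    for k, b_row in enumerate(B):
        coeff = b_row[j]    # entry k of column j of B scales column k of A
        result = [r + coeff * a_row[k] for r, a_row in zip(result, A)]
    return result

print(row_of_product(A, B, 1))      # second row of AB
print(row_of_product(A, B, 2))      # third row of AB
print(column_of_product(A, B, 1))   # second column of AB
```

Asking for `column_of_product(A, B, 2)` raises an `IndexError`, which is the computational echo of 9c: the righthand matrix has no third column.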


Section 3.3 


1a: (solution 1): If the vectors are augmented to form a matrix, then theorem 5 applies. It gives 6 ways to show that
the columns of a matrix are linearly independent, parts (ii)-(vii). If we can show any one of them true, we have 
shown that the columns of the augmented matrix, which are the given vectors, are linearly independent. The 
simplest route to a conclusion is to use part (iv) of the theorem ($M$ has a pivot position in every column), as
determining pivot positions amounts to doing some row reduction. 


Augmenting the vectors gives the matrix 


\[
M = \begin{bmatrix} -1 & 5 \\ -1 & 4 \end{bmatrix}
\]
Note that the vectors have been augmented in an order that makes row reduction simple (a —1 in the 1,1-entry). 
This is consistent with the idea that linear independence is a characteristic of a set, where order of elements
does not matter. The row reduction can be completed in one operation: add —1 times the first row to the second 
row, which yields 
\[
\begin{bmatrix} -1 & 5 \\ 0 & -1 \end{bmatrix}
\]


At this point, it is clear the matrix has two pivot positions, one in each column. By theorem 5 the columns of 
M are linearly independent. 


Remark: We can also see at this point that M has no free variables—part (vii) of theorem 5—giving another 
way to conclude that the columns of M are linearly independent. 


(solution 2): The definition of linear independence can be used just as well. The definition revolves around 
linear combinations of the vectors that sum to zero: 

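\[
x_1\begin{bmatrix} 5 \\ 4 \end{bmatrix} + x_2\begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

which can also be written as

\[
x_2\begin{bmatrix} -1 \\ -1 \end{bmatrix} + x_1\begin{bmatrix} 5 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

or

\[
\begin{bmatrix} -1 & 5 \\ -1 & 4 \end{bmatrix}\begin{bmatrix} x_2 \\ x_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{7.5.12}
\]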


As in solution 1, we choose to set up the system to make the row reduction simple. Writing the augmented 
matrix for (7.5.12) and reducing: 


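\[
\begin{bmatrix} -1 & 5 & 0 \\ -1 & 4 & 0 \end{bmatrix} \to
\begin{bmatrix} -1 & 5 & 0 \\ 0 & -1 & 0 \end{bmatrix} \to
\begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
\]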


which means the (one and only) solution of the system is x, = x. = 0. By definition, the vectors are linearly 
independent. 
1c: (solution 1): If the vectors are augmented to form a matrix, then theorem 5 applies. It gives 6 ways to show that
the columns of a matrix are linearly independent, parts (ii)-(vii). If we can show any one of them true, we have 
shown that the columns of the augmented matrix, which are the given vectors, are linearly independent. The 
simplest route to a conclusion is to use part (iv) of the theorem ($M$ has a pivot position in every column), as
determining pivot positions amounts to doing some row reduction. 


Augmenting the vectors gives the matrix 


\[
M = \begin{bmatrix} -1 & 4 \\ 2 & -3 \\ -1 & -3 \end{bmatrix}
\]


Note that the vectors have been augmented in an order that makes row reduction simple (a —1 in the 1,1-entry). 
This is consistent with the idea that linear independence is a characteristic of a set, where order of elements
does not matter. Enough row reduction can be completed in just two row operations: 


\[
\begin{bmatrix} -1 & 4 \\ 2 & -3 \\ -1 & -3 \end{bmatrix} \to \begin{bmatrix} -1 & 4 \\ 0 & 5 \\ 0 & -7 \end{bmatrix}
\]


At this point, it is clear the matrix has two pivot positions, one in each column. By theorem 5 the columns of 
M are linearly independent. 


Remark: We can also see at this point that M has no free variables—part (vii) of theorem 5—giving another 
way to conclude that the columns of M are linearly independent. 


(solution 2): The definition of linear independence can be used just as well. The definition revolves around 
linear combinations of the vectors that sum to zero: 


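\[
x_1\begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix} + x_2\begin{bmatrix} 4 \\ -3 \\ -3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
which can also be written as
\[
\begin{bmatrix} -1 & 4 \\ 2 & -3 \\ -1 & -3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \tag{7.5.13}
\]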


As in solution 1, we choose to set up the system to make the row reduction simple. Writing the augmented 
matrix for (7.5.13) and reducing: 


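\[
\begin{bmatrix} -1 & 4 & 0 \\ 2 & -3 & 0 \\ -1 & -3 & 0 \end{bmatrix} \to
\begin{bmatrix} -1 & 4 & 0 \\ 0 & 5 & 0 \\ 0 & -7 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & -4 & 0 \\ 0 & 1 & 0 \\ 0 & -7 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]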


which means the (one and only) solution of the system is x; = x2. = 0. By definition, the vectors are linearly 
independent. 


2c: We are trying to conclude that the system has at most one solution for any constants. Part (vi) of theorem 5 is 
exactly this statement, which means if we can show that any one of the other conditions of theorem 5 holds, 
we are done. Parts (iii), (iv), and (vii) are within reach. Each one follows from row reduction of the coefficient 
matrix of the system. To make the work a little easier, we rewrite the system as 


\[
\begin{aligned}
-v_3 - v_2 - 2v_1 &= b_1 \\
2v_2 + 7v_1 &= b_2 \\
v_3 - v_2 - 4v_1 &= b_3
\end{aligned}
\]


and reduce the corresponding coefficient matrix: 


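\[
\begin{bmatrix} -1 & -1 & -2 \\ 0 & 2 & 7 \\ 1 & -1 & -4 \end{bmatrix} \to
\begin{bmatrix} -1 & -1 & -2 \\ 0 & 2 & 7 \\ 0 & -2 & -6 \end{bmatrix} \to
\begin{bmatrix} -1 & -1 & -2 \\ 0 & 2 & 7 \\ 0 & 0 & 1 \end{bmatrix}
\]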


At this point it is clear that the system has no free variables (and that the coefficient matrix has a pivot in every 
column), so theorem 5 gives us that the system has at most one solution for any selection of constants. 
2d: We are trying to conclude that the system has at most one solution for any constants. Part (vi) of theorem 5 is
exactly this statement, which means if we can show that any one of the other conditions of theorem 5 holds, 
we are done. Parts (iii), (iv), and (vii) are within reach. Each one follows from row reduction of the coefficient 
matrix of the system. To make the work a little easier, we rewrite the system as 


\[
\begin{aligned}
-z + 3y - 6x &= b_1 \\
2z + 7y + 6x &= b_2 \\
7z + 5y + x &= b_3 \\
y + 5x &= b_4
\end{aligned}
\]


and reduce the corresponding coefficient matrix: 


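\[
\begin{bmatrix} -1 & 3 & -6 \\ 2 & 7 & 6 \\ 7 & 5 & 1 \\ 0 & 1 & 5 \end{bmatrix} \to
\begin{bmatrix} -1 & 3 & -6 \\ 0 & 13 & -6 \\ 0 & 26 & -41 \\ 0 & 1 & 5 \end{bmatrix} \to
\begin{bmatrix} -1 & 3 & -6 \\ 0 & 1 & 5 \\ 0 & 26 & -41 \\ 0 & 13 & -6 \end{bmatrix} \to
\begin{bmatrix} -1 & 3 & -6 \\ 0 & 1 & 5 \\ 0 & 0 & -171 \\ 0 & 0 & -71 \end{bmatrix}
\]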


At this point it is clear that the system has no free variables (and that the coefficient matrix has a pivot in every 
column), so theorem 5 gives us that the system has at most one solution for any selection of constants. 


3a: We are asked to show that the homogeneous system has only the trivial solution. Part (iii) of theorem 5 is exactly 
this statement, which means if we can show that any one of the other conditions of theorem 5 holds, we are 
done. Parts (iv) and (vii) are within reach. Each one follows from row reduction of the coefficient matrix of the 


system: 
\[
\begin{bmatrix} 1 & 8 \\ 1 & -5 \end{bmatrix} \to \begin{bmatrix} 1 & 8 \\ 0 & -13 \end{bmatrix}
\]


At this point it is clear that the system has no free variables (and that the coefficient matrix has a pivot in every 
column), so theorem 5 gives us that the homogeneous system has only the trivial solution. 


3g: We are asked to show that the homogeneous system has only the trivial solution. Part (iii) of theorem 5 is exactly 
this statement, which means if we can show that any one of the other conditions of theorem 5 holds, we are 
done. Parts (iv) and (vii) are within reach. Each one follows from row reduction of the coefficient matrix of the 


system: 
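\[
\begin{bmatrix} 6 & 3 & -1 \\ 5 & 0 & 1 \\ 1 & -4 & 1 \\ 5 & 7 & -4 \end{bmatrix} \to
\begin{bmatrix} 1 & -4 & 1 \\ 5 & 0 & 1 \\ 6 & 3 & -1 \\ 5 & 7 & -4 \end{bmatrix} \to
\begin{bmatrix} 1 & -4 & 1 \\ 0 & 20 & -4 \\ 0 & 27 & -7 \\ 0 & 27 & -9 \end{bmatrix} \to
\begin{bmatrix} 1 & -4 & 1 \\ 0 & 20 & -4 \\ 0 & 27 & -7 \\ 0 & 0 & -2 \end{bmatrix}
\]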


At this point it is clear that the system has no free variables (and that the coefficient matrix has a pivot in every 
column), so theorem 5 gives us that the homogeneous system has only the trivial solution. 


4b: Using the definition of linear independence, we need to determine the nature of the solutions of 
\[
a\sin^2 t + b\cos^2 t = 0 \tag{7.5.14}
\]

where 0 is the zero function, not the number zero. This means the equation has to be true for all values of $t$!
Attempting to solve the equation for $b$:

\[
\begin{aligned}
b\cos^2 t &= -a\sin^2 t \\
b &= -a\,\frac{\sin^2 t}{\cos^2 t} \\
b &= -a\tan^2 t
\end{aligned}
\]

This last equation is true for all values of $t$ (for which it is defined) only if $a = 0$, which forces $b = 0$. In other
words, the only solution is $a = b = 0$. Since the only solution of (7.5.14) is the trivial solution, $\sin^2 t$ and $\cos^2 t$
are linearly independent.
5c: The double angle formula, $\cos(2t) = \cos^2 t - \sin^2 t$, suggests that
\[
\sin^2 t - \cos^2 t + \cos(2t) = 0
\]
for all $t$. Thereby we have produced a nontrivial linear combination of $\sin^2 t$, $\cos^2 t$, and $\cos(2t)$ that sums to
the zero function, proving that $\sin^2 t$, $\cos^2 t$, and $\cos(2t)$ are linearly dependent.


6g: The question of whether the columns form a linearly independent set is a rewording of the question of whether 
the columns are linearly independent. By theorem 5, it is equivalent to determine whether the matrix has a 
pivot position in every column. If it does, part (iv) of the theorem is true, and therefore part (i) is true. If it does 
not, part (iv) of the theorem is false, and therefore part (i) is false. 


Since the matrix has 4 rows, it can have at most 4 pivot positions. Since it has 6 columns, this means there is 
certainly a column (at least two, actually) without a pivot position. Therefore part (iv) of theorem 5 is false. 
Equivalently, part (i) is false and the columns of the matrix are linearly dependent. 


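7b: The question can be rephrased as a question about a linear system. By definition, the linear independence of the
set hinges on the nature of the solutions of

\[
a\begin{bmatrix} 1 & 8 & -11 \end{bmatrix} + b\begin{bmatrix} 9 & 4 & -7 \end{bmatrix} + c\begin{bmatrix} 4 & -2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}. \tag{7.5.15}
\]
This equation can be rewritten using algebra:

\[
\begin{aligned}
\begin{bmatrix} a & 8a & -11a \end{bmatrix} + \begin{bmatrix} 9b & 4b & -7b \end{bmatrix} + \begin{bmatrix} 4c & -2c & 2c \end{bmatrix} &= \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} \\
\begin{bmatrix} a + 9b + 4c & 8a + 4b - 2c & -11a - 7b + 2c \end{bmatrix} &= \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}
\end{aligned}
\]

The only way for this last equation to be true is if $a, b, c$ solve the linear system

\[
\begin{aligned}
a + 9b + 4c &= 0 \\
8a + 4b - 2c &= 0 \\
-11a - 7b + 2c &= 0
\end{aligned} \tag{7.5.16}
\]

By row reduction,

\[
\begin{bmatrix} 1 & 9 & 4 & 0 \\ 8 & 4 & -2 & 0 \\ -11 & -7 & 2 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & 9 & 4 & 0 \\ 0 & -68 & -34 & 0 \\ 0 & 92 & 46 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & 9 & 4 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 2 & 1 & 0 \end{bmatrix} \to
\begin{bmatrix} 1 & 9 & 4 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

At this point, it is clear that the system has a free variable, $c$. Therefore part (vii) of theorem 5 is false. The
equivalent part (iii) is therefore false, so (7.5.16) has nontrivial solutions, for example $a = -1$, $b = 1$, $c = -2$. Any such
solution is also a nontrivial solution of (7.5.15), so the set is linearly dependent.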


18: To show that two statements are equivalent requires showing that if one of them is true, so is the other, and vice versa,
so there are two things to show. (i) If (a) is true, then (b) is true; and (ii) if (b) is true, then (a) is true.


(a) $\Rightarrow$ (b): [We start by assuming (a) is true and showing that (b) follows logically.] Suppose $x = 8$. Then
$x$ is a perfect cube since $2^3 = 8$. Of course $6 < 8 < 20$ so $6 < x < 20$. Hence $x$ is a perfect cube between 6 and
20.


(b) $\Rightarrow$ (a): [We now assume (b) is true and show that (a) follows logically.] Suppose $x$ is a perfect cube
between 6 and 20. Because x is a perfect cube, it must be one of the numbers 1, 8, 27 or higher. The only one 
of those numbers between 6 and 20 is 8, so x must be 8. 
Section 3.4


5d: According to theorem 6 it is equivalent to show that the coefficient matrix has a pivot position in every row. This 
is done by row reduction: 


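\[
\begin{bmatrix} 6 & 5 & 1 & 5 \\ 3 & 0 & -4 & 7 \\ -1 & 1 & 1 & -4 \end{bmatrix} \to
\begin{bmatrix} -1 & 1 & 1 & -4 \\ 3 & 0 & -4 & 7 \\ 6 & 5 & 1 & 5 \end{bmatrix} \to
\begin{bmatrix} -1 & 1 & 1 & -4 \\ 0 & 3 & -1 & -5 \\ 0 & 11 & 7 & -19 \end{bmatrix}
\]
\[
\to \begin{bmatrix} -1 & 1 & 1 & -4 \\ 0 & 33 & -11 & -55 \\ 0 & -33 & -21 & 57 \end{bmatrix} \to
\begin{bmatrix} -1 & 1 & 1 & -4 \\ 0 & 33 & -11 & -55 \\ 0 & 0 & -32 & 2 \end{bmatrix}
\]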


At this point it is clear there is a pivot in every row. Therefore the system has only the trivial solution. 


6e: According to theorem 6 it is equivalent to answer the question “Does the matrix have a pivot position in every 
row?” This is determined by row reduction: 


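\[
\begin{bmatrix} -18 & 1 & -1 & 7 \\ -6 & 2 & 6 & 3 \\ 12 & -5 & 0 & 4 \end{bmatrix} \to
\begin{bmatrix} -6 & 2 & 6 & 3 \\ -18 & 1 & -1 & 7 \\ 12 & -5 & 0 & 4 \end{bmatrix} \to
\begin{bmatrix} -6 & 2 & 6 & 3 \\ 0 & -5 & -19 & -2 \\ 0 & -1 & 12 & 10 \end{bmatrix} \to
\begin{bmatrix} -6 & 2 & 6 & 3 \\ 0 & -1 & 12 & 10 \\ 0 & 0 & -79 & -52 \end{bmatrix}
\]

At this point it is clear there is a pivot in every row. Therefore the rows form a linearly independent set.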


6g: According to theorem 6 it is equivalent to answer the question “Does the matrix have a pivot position in every 
row?” Since the matrix has 5 columns, it can have at most 5 pivot positions. However the matrix has 6 rows, 
so cannot have a pivot in every row. Consequently the rows do not form a linearly independent set. 


17: What makes any matrix $M$ upper triangular is that $M_{k,\ell} = 0$ whenever $k > \ell$. That is, entries whose row number
is greater than their column number are zero. After deleting the first row and some column of $U$, as shown
below,

[diagram: the upper triangular matrix $U$ with its first row and its $j$-th column struck out]

there are two distinct regions of entries. Entries to the left of the deleted column have the same column index
in $U_{\setminus 1,j}$ as they do in $U$. Columns to the right of the deleted column have a column index one less than they do
in $U$. All entries in $U_{\setminus 1,j}$ have a row index one less than they do in $U$. In symbols [and the start of the proof],

\[
(U_{\setminus 1,j})_{k,\ell} = \begin{cases} U_{k+1,\ell} & \text{if } \ell < j \\ U_{k+1,\ell+1} & \text{if } \ell \ge j \end{cases}
\]

If $k > \ell$, then $k+1 > \ell$ and $k+1 > \ell+1$, so $U_{k+1,\ell} = U_{k+1,\ell+1} = 0$. Hence $(U_{\setminus 1,j})_{k,\ell} = 0$ whenever $k > \ell$.
18: As long as $j > 1$, $(U_{\setminus 1,j})_{1,1} = U_{2,1} = 0$.


19: [Commentary that is not strictly part of the proof will be inserted in square brackets and bold italicized.] If $U$
is a $1 \times 1$ matrix, it is upper triangular and $\det([U_{1,1}]) = U_{1,1}$. [This establishes part (i) of the proof. The claim
is true for the particular value $n = 1$.] Now assume that $\det U = U_{1,1}U_{2,2}\cdots U_{k,k}$ for some (arbitrary) value $k$
greater than or equal to one and arbitrary upper triangular $k \times k$ matrix $U$. [That is, if $U$ is an upper triangular
$k \times k$ matrix and $k \ge 1$, then the proposition is true. To complete the proof, we must use this information to
prove that if $U$ is a $(k+1) \times (k+1)$ upper triangular matrix then $\det U = U_{1,1}U_{2,2}\cdots U_{k+1,k+1}$.] Additionally,
suppose $U$ is a $(k+1) \times (k+1)$ upper triangular matrix. By definition,

\[
\det U = (-1)^{1+1}U_{1,1}\det U_{\setminus 1,1} + (-1)^{1+2}U_{1,2}\det U_{\setminus 1,2} + \cdots + (-1)^{1+(k+1)}U_{1,k+1}\det U_{\setminus 1,k+1}.
\]
Since $U_{\setminus 1,j}$ has a zero on its diagonal whenever $j > 1$, all terms except the first are zero. Therefore,
\[
\det U = (-1)^{1+1}U_{1,1}\det U_{\setminus 1,1}. \tag{7.5.17}
\]

Since $U_{\setminus 1,1}$ is a $k \times k$ matrix, its determinant is the product of the entries on its diagonal [this is the inductive
hypothesis], so $\det U_{\setminus 1,1} = U_{2,2}U_{3,3}\cdots U_{k+1,k+1}$. Substituting this expression into (7.5.17), we have $\det U =
U_{1,1}U_{2,2}U_{3,3}\cdots U_{k+1,k+1}$, and the proof is complete.
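
The induction in 19 follows the first-row cofactor expansion exactly, so a recursive plain-Python sketch of that expansion makes a handy companion. The matrix `U` is made up for the illustration (only its diagonal, 2, 8, 4, 7, is chosen deliberately), and the helper names are invented.

```python
def minor(M, i, j):
    """The matrix M with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    # With 0-based j, the sign (-1)**j equals (-1)^(1 + (j+1)) from the definition.
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

# A made-up upper triangular matrix with diagonal 2, 8, 4, 7.
U = [[2, 5, -1, 3],
     [0, 8, 4, 7],
     [0, 0, 4, -6],
     [0, 0, 0, 7]]

diagonal_product = 1
for i in range(len(U)):
    diagonal_product *= U[i][i]

print(det(U), diagonal_product)
```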


Section 3.5 


1a: It is advantageous to expand along rows and columns with many zeros since corresponding cofactors will not
need to be computed. They will be multiplied by the zero entry in the computation of the determinant anyway. 
In this case, the fourth row has three zeros: 


\[
\text{determinant} = 4(1)\begin{vmatrix} 0 & -2 & 0 \\ -4 & 0 & 1 \\ 4 & -9 & 0 \end{vmatrix} = 4(-2)(-1)\begin{vmatrix} -4 & 1 \\ 4 & 0 \end{vmatrix} = 8(-4) = -32
\]


The 3 x 3 determinant is expanded along the first row since it contains two zeros. The 2 x 2 determinant, at this 
point, is probably as simple as having memorized the pattern (upper left times lower right minus upper right 
times lower left). 


2c: The determinant of any triangular matrix is the product of the entries on its main diagonal. In this case, $2 \cdot 8 \cdot 4 \cdot 7 = 448$.
3: (a) $\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is a swap matrix, so the determinant of the product is $-1$ times the determinant of the given
matrix (see justification on page 102): $(-1)(32) = -32$.


1 0 O 
(b) | 0 1 O | is a scale matrix, so the determinant of the product is the scale factor, a times the deter- 


0 0 iz 

minant of the given matrix (see justification on page 102): a 32) = i. 
1 0 0 

(c)} O 1 -; is a replacement matrix, so the determinant of the product is the same as the determinant of 
0 0 1 


the given matrix (see justification on page 102): 32. 


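(d) The matrices $\begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 18 & 0 & 1 \end{bmatrix}$ are a scale by 3 and a replacement. Their effect on the determinant
of the product is, altogether, to multiply it by 3: $3(32) = 96$.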


Section 3.6 


1: The determinant of the given matrix is 120. (a) Row replacements do not change the determinant of a matrix, so the 
original matrix must have had determinant 120 as well. (b) Row replacements do not change the determinant 
of a matrix, and row swaps change the sign of the determinant. Therefore (det A)(—1)? = 120 so det A = —120. 
(c) Row replacements do not change the determinant of a matrix, and row scaling multiplies the determinant by
the scale factor. Therefore $(\det A)(6)(5) = 120$ so $\det A = 4$. (d) Row replacements do not change the determinant
of a matrix, row scaling multiplies the determinant by the scale factor, and row swaps change the sign of the
determinant. Therefore $(\det A)(10)(-1)^2 = 120$ so $\det A = 12$.


2b: The determinant of the given matrix is the product of the entries on the main diagonal (since the matrix is
triangular): $(-2)(-1)(5)(6) = 60$. Row replacements do not change the determinant of a matrix, and row swaps
change the sign of the determinant. Therefore $(\det A)(-1)^k = 60$ where $k$ is the number of row swaps. Since
$(-1)^k$ equals 1 or $-1$, $\det A = \pm 60$.


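4b: Row reducing:

\[
\begin{bmatrix} -12 & -12 \\ 14 & 6 \end{bmatrix}
\xrightarrow{-\frac{1}{12}M_{1,:}\to M_{1,:}}
\begin{bmatrix} 1 & 1 \\ 14 & 6 \end{bmatrix}
\xrightarrow{-14M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix} 1 & 1 \\ 0 & -8 \end{bmatrix}
\]

The determinant of a triangular matrix is the product of the entries of the diagonal, so $\det\begin{bmatrix} 1 & 1 \\ 0 & -8 \end{bmatrix} = -8$.

The triangular matrix was gotten by scaling with factor $-\frac{1}{12}$ and one row replacement. Hence the determinant
of $M$ satisfies $(\det M)\left(-\frac{1}{12}\right) = -8$ so $\det M = 96$.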


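4f: Row reducing:

\[
\begin{bmatrix} -11 & -15 & 4 \\ 8 & 9 & -4 \\ -3 & -3 & 2 \end{bmatrix}
\xrightarrow{11M_{2,:}\to M_{2,:},\; -11M_{3,:}\to M_{3,:}}
\begin{bmatrix} -11 & -15 & 4 \\ 88 & 99 & -44 \\ 33 & 33 & -22 \end{bmatrix}
\xrightarrow{8M_{1,:}+M_{2,:}\to M_{2,:},\; 3M_{1,:}+M_{3,:}\to M_{3,:}}
\begin{bmatrix} -11 & -15 & 4 \\ 0 & -21 & -12 \\ 0 & -12 & -10 \end{bmatrix}
\]
\[
\xrightarrow{-\frac{1}{3}M_{2,:}\to M_{2,:},\; \frac{7}{2}M_{3,:}\to M_{3,:}}
\begin{bmatrix} -11 & -15 & 4 \\ 0 & 7 & 4 \\ 0 & -42 & -35 \end{bmatrix}
\xrightarrow{6M_{2,:}+M_{3,:}\to M_{3,:}}
\begin{bmatrix} -11 & -15 & 4 \\ 0 & 7 & 4 \\ 0 & 0 & -11 \end{bmatrix}
\]

The determinant of a triangular matrix is the product of the entries of the diagonal, so
\[
\det\begin{bmatrix} -11 & -15 & 4 \\ 0 & 7 & 4 \\ 0 & 0 & -11 \end{bmatrix} = 7(11)^2.
\]

The triangular matrix was gotten by scaling with factors $11, -11, -\frac{1}{3}, \frac{7}{2}$ and three row replacements. Hence the
determinant of $M$ satisfies $(\det M)(11)(-11)\left(-\frac{1}{3}\right)\left(\frac{7}{2}\right) = 7(11)^2$ so

\[
\det M = \frac{7(11)^2}{11(-11)\left(-\frac{1}{3}\right)\left(\frac{7}{2}\right)} = 6.
\]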
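4g: Row reducing:
\[
\begin{bmatrix} 3 & 90 & -308 & -6 \\ -3 & -140 & 484 & 10 \\ 6 & 210 & -737 & -16 \\ 3 & 70 & -231 & -4 \end{bmatrix}
\xrightarrow{M_{1,:}+M_{2,:}\to M_{2,:},\; -2M_{1,:}+M_{3,:}\to M_{3,:},\; -M_{1,:}+M_{4,:}\to M_{4,:}}
\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 30 & -121 & -4 \\ 0 & -20 & 77 & 2 \end{bmatrix}
\]
\[
\xrightarrow{5M_{3,:}\to M_{3,:},\; -5M_{4,:}\to M_{4,:}}
\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 150 & -605 & -20 \\ 0 & 100 & -385 & -10 \end{bmatrix}
\xrightarrow{3M_{2,:}+M_{3,:}\to M_{3,:},\; 2M_{2,:}+M_{4,:}\to M_{4,:}}
\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 0 & -77 & -8 \\ 0 & 0 & -33 & -2 \end{bmatrix}
\]
\[
\xrightarrow{-7M_{4,:}\to M_{4,:}}
\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 0 & -77 & -8 \\ 0 & 0 & 231 & 14 \end{bmatrix}
\xrightarrow{3M_{3,:}+M_{4,:}\to M_{4,:}}
\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 0 & -77 & -8 \\ 0 & 0 & 0 & -10 \end{bmatrix}
\]

The determinant of a triangular matrix is the product of the entries of the diagonal, so

\[
\det\begin{bmatrix} 3 & 90 & -308 & -6 \\ 0 & -50 & 176 & 4 \\ 0 & 0 & -77 & -8 \\ 0 & 0 & 0 & -10 \end{bmatrix} = -3(50)(77)(10).
\]

The triangular matrix was gotten by scaling with factors $5, -5, -7$, and six row replacements. Hence the
determinant of $M$ satisfies $(\det M)(5)(-5)(-7) = -3(50)(77)(10)$ so

\[
\det M = \frac{-3(50)(77)(10)}{5(-5)(-7)} = -3(10)(-11)(-2) = -660.
\]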
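
The bookkeeping in exercise 4 (each swap flips the sign, each replacement changes nothing) can be sketched in plain Python. Unlike the hand solutions above, this sketch never scales a row, so no scale factors need to be divided back out; that is a deliberate simplification, not the method used in the solutions. The three test matrices are assumptions read off the reductions above.

```python
from fractions import Fraction

def det_by_row_reduction(matrix):
    """Determinant via reduction to triangular form using only swaps and replacements."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    sign = 1
    for col in range(n):
        source = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if source is None:
            return Fraction(0)            # no pivot in this column
        if source != col:
            rows[col], rows[source] = rows[source], rows[col]
            sign = -sign                  # a row swap changes the sign
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            # A row replacement leaves the determinant unchanged.
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= rows[i][i]             # product of the diagonal of the triangular form
    return product

M_4b = [[-12, -12], [14, 6]]
M_4f = [[-11, -15, 4], [8, 9, -4], [-3, -3, 2]]
M_4g = [[3, 90, -308, -6], [-3, -140, 484, 10],
        [6, 210, -737, -16], [3, 70, -231, -4]]
results = [det_by_row_reduction(M) for M in (M_4b, M_4f, M_4g)]
print(results)
```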


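5b: Eliminating the 4 by row replacement and then expanding along the first column:

\[
\begin{vmatrix} -1 & 5 & -1 \\ 0 & 6 & -1 \\ 4 & -10 & 2 \end{vmatrix} \longrightarrow \begin{vmatrix} -1 & 5 & -1 \\ 0 & 6 & -1 \\ 0 & 10 & -2 \end{vmatrix}
\]
and
\[
\begin{vmatrix} -1 & 5 & -1 \\ 0 & 6 & -1 \\ 0 & 10 & -2 \end{vmatrix} = -1\begin{vmatrix} 6 & -1 \\ 10 & -2 \end{vmatrix} = -(-12 + 10) = 2.
\]

Since the original matrix was only changed by row replacement, its determinant and the determinant of the
resulting matrix are equal. Hence

\[
\begin{vmatrix} -1 & 5 & -1 \\ 0 & 6 & -1 \\ 4 & -10 & 2 \end{vmatrix} = 2.
\]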


6e: Theorem 7 gives a number of conditions equivalent to invertibility of a square matrix M. Among them are (vi) 
M has a pivot position in every column; (vii) M has a pivot position in every row; and (xiii) det M # 0. If we 
can show any one of these, we will know that the given matrix is invertible. Row reducing: 


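\[
\begin{bmatrix} 1 & 1 & 3 \\ 4 & 7 & 11 \\ 0 & 3 & 20 \end{bmatrix}
\xrightarrow{-4M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix} 1 & 1 & 3 \\ 0 & 3 & -1 \\ 0 & 3 & 20 \end{bmatrix}
\xrightarrow{-M_{2,:}+M_{3,:}\to M_{3,:}}
\begin{bmatrix} 1 & 1 & 3 \\ 0 & 3 & -1 \\ 0 & 0 & 21 \end{bmatrix}
\]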


At this point, we can see all of (vi), (vii), and (xiii) are true. We may pick any one of them to explain why the 
given matrix is invertible. For example, the given matrix has nonzero determinant and is therefore invertible. 


Section 3.7 


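1c: Augmenting the identity matrix and reducing:

\[
\left[\begin{array}{rr|rr} 9 & -7 & 1 & 0 \\ 4 & 7 & 0 & 1 \end{array}\right]
\xrightarrow{-2M_{2,:}+M_{1,:}\to M_{1,:}}
\left[\begin{array}{rr|rr} 1 & -21 & 1 & -2 \\ 4 & 7 & 0 & 1 \end{array}\right]
\xrightarrow{-4M_{1,:}+M_{2,:}\to M_{2,:}}
\left[\begin{array}{rr|rr} 1 & -21 & 1 & -2 \\ 0 & 91 & -4 & 9 \end{array}\right]
\]
\[
\xrightarrow{\frac{1}{91}M_{2,:}\to M_{2,:}}
\left[\begin{array}{rr|rr} 1 & -21 & 1 & -2 \\ 0 & 1 & -\frac{4}{91} & \frac{9}{91} \end{array}\right]
\xrightarrow{21M_{2,:}+M_{1,:}\to M_{1,:}}
\left[\begin{array}{rr|rr} 1 & 0 & \frac{1}{13} & \frac{1}{13} \\ 0 & 1 & -\frac{4}{91} & \frac{9}{91} \end{array}\right]
\]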
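
The augment-and-reduce method of exercise 1 can be sketched in plain Python with exact fractions. The loop below picks its own pivots, so its intermediate matrices will generally differ from the hand reductions, but the inverse it arrives at is the same. The function name `inverse` and the test matrices (read off 1c and 1d) are assumptions made for the illustration.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by row reducing [M | I] to [I | M^-1]."""
    n = len(matrix)
    # Augment each row of M with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        source = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if source is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[source] = aug[source], aug[col]      # swap
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]           # scale
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]  # replace
    return [row[n:] for row in aug]

A_inv = inverse([[9, -7], [4, 7]])                         # matrix of 1c
B_inv = inverse([[16, -3, -2], [-8, 4, -2], [-8, 1, 2]])   # matrix of 1d (assumed)
print(A_inv)
print(B_inv)
```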


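1d: Augmenting the identity matrix and reducing:

\[
\left[\begin{array}{rrr|rrr} 16 & -3 & -2 & 1 & 0 & 0 \\ -8 & 4 & -2 & 0 & 1 & 0 \\ -8 & 1 & 2 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{M_{1,:}\leftrightarrow M_{3,:}}
\left[\begin{array}{rrr|rrr} -8 & 1 & 2 & 0 & 0 & 1 \\ -8 & 4 & -2 & 0 & 1 & 0 \\ 16 & -3 & -2 & 1 & 0 & 0 \end{array}\right]
\xrightarrow{-M_{1,:}+M_{2,:}\to M_{2,:},\; 2M_{1,:}+M_{3,:}\to M_{3,:}}
\left[\begin{array}{rrr|rrr} -8 & 1 & 2 & 0 & 0 & 1 \\ 0 & 3 & -4 & 0 & 1 & -1 \\ 0 & -1 & 2 & 1 & 0 & 2 \end{array}\right]
\]
\[
\xrightarrow{M_{2,:}\leftrightarrow M_{3,:}}
\left[\begin{array}{rrr|rrr} -8 & 1 & 2 & 0 & 0 & 1 \\ 0 & -1 & 2 & 1 & 0 & 2 \\ 0 & 3 & -4 & 0 & 1 & -1 \end{array}\right]
\xrightarrow{3M_{2,:}+M_{3,:}\to M_{3,:}}
\left[\begin{array}{rrr|rrr} -8 & 1 & 2 & 0 & 0 & 1 \\ 0 & -1 & 2 & 1 & 0 & 2 \\ 0 & 0 & 2 & 3 & 1 & 5 \end{array}\right]
\]
\[
\xrightarrow{-M_{3,:}+M_{1,:}\to M_{1,:},\; -M_{3,:}+M_{2,:}\to M_{2,:}}
\left[\begin{array}{rrr|rrr} -8 & 1 & 0 & -3 & -1 & -4 \\ 0 & -1 & 0 & -2 & -1 & -3 \\ 0 & 0 & 2 & 3 & 1 & 5 \end{array}\right]
\xrightarrow{M_{2,:}+M_{1,:}\to M_{1,:},\; -M_{2,:}\to M_{2,:}}
\left[\begin{array}{rrr|rrr} -8 & 0 & 0 & -5 & -2 & -7 \\ 0 & 1 & 0 & 2 & 1 & 3 \\ 0 & 0 & 2 & 3 & 1 & 5 \end{array}\right]
\]
\[
\xrightarrow{-\frac{1}{8}M_{1,:}\to M_{1,:},\; \frac{1}{2}M_{3,:}\to M_{3,:}}
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \frac{5}{8} & \frac{1}{4} & \frac{7}{8} \\ 0 & 1 & 0 & 2 & 1 & 3 \\ 0 & 0 & 1 & \frac{3}{2} & \frac{1}{2} & \frac{5}{2} \end{array}\right]
\]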

2: Three facts are applied.

1. [section 3.7] The determinant of a product is the product of the determinants [det(AB) = (det A)(det B)].
2. [section 3.7] The determinant of an inverse is the reciprocal of the determinant [$\det(A^{-1}) = \frac{1}{\det A}$].
3. [section 3.5] The determinant of a transpose equals the determinant [$\det(A^T) = \det A$].

(a) $\det(MR^T) = (\det M)(\det R^T) = (\det M)(\det R) = 2 \cdot \frac{1}{3} = \frac{2}{3}$ using facts 1 and 3.
(b) $\det(M^{-1}R) = (\det M^{-1})(\det R) = \frac{1}{\det M}\det R = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}$ using facts 1 and 2.
(c) $\det\left((MR^{-1})^T\right) = \det(MR^{-1}) = (\det M)(\det R^{-1}) = \det M \cdot \frac{1}{\det R} = 2 \cdot 3 = 6$ using facts 1, 2 and 3.


Section 4.1 


1b: Without any information to whittle down the list of properties to verify, all 10 of the properties from the definition 
of vector space must be verified. Let f,g,h be in V (letters commonly associated with functions are used to 
emphasize that these vectors are in fact functions) and let s,t be scalars. Two functions are equal if they take 
the same value (give the same output) on every point in the domain. 


1. f + g means f(x) + g(x) and is a function defined on [0, 1] since f(x) and g(x) are. Therefore f + g is in V. 


2. f(x) + g(x) = g(x) + f(x) for every x in [0, 1] since addition of real numbers is commutative. Therefore 
f + g = g + f. 


3. f(x) + (g(x) + h(x)) = (f(x) + g(x)) + h(x) for every x in [0, 1] since addition of real numbers is associative. 
Therefore f + (g + h) = (f + g) + h. 


4. The function z(x) = 0 (whose graph is a horizontal line on the x-axis) is in V and z(x) + f(x) = f(x) for all 
x in [0, 1]. Therefore z is an additive identity in V. 


5. The function -f is in V since it is a function on [0, 1] and has the property that f(x) + (-f(x)) = 0 for all x 
in [0, 1]. Therefore f + (-f) = z so every element of V has an additive inverse. 


6. sf(x) is a function on [0, 1] because f(x) is. Therefore sf is in V. 
7. 1f(x) = f(x) for all x in [0, 1]. Therefore 1f = f. 


8. s(f(x) + g(x)) = sf(x) + sg(x) for every x in [0, 1] by the distributive property of real numbers. Therefore 
s(f + g) = sf + sg. 


9. (s + t)f(x) = sf(x) + tf(x) for every x in [0, 1] by the distributive property of real numbers. Therefore 
(s + t)f = sf + tf. 


10. s(tf(x)) = (st)f(x) for every x in [0, 1] because multiplication of real numbers is associative. Therefore 
s(tf) = (st)f. 


2a: [To verify that S is a subspace, three properties need to be shown—that the zero vector is in S; that S is 
closed under addition; and that S is closed under scalar multiplication.] 


In $\mathbb{R}^2$, $0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ is in S since it is on the x-axis (the y-coordinate is zero). [This shows that the zero 
vector is in S.] For any $u = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$ in S, $y_1 = 0$, so $u = \begin{bmatrix} x_1 \\ 0 \end{bmatrix}$. It follows that $su = \begin{bmatrix} sx_1 \\ 0 \end{bmatrix}$, which is in S since 
it also lies on the x-axis (has y-coordinate zero). [This shows that S is closed under scalar multiplication.] 
Similarly, for any $v = \begin{bmatrix} x_2 \\ y_2 \end{bmatrix}$ in S, $y_2 = 0$. Therefore $u + v = \begin{bmatrix} x_1 \\ 0 \end{bmatrix} + \begin{bmatrix} x_2 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 \\ 0 \end{bmatrix}$, which is in S since 
it also lies on the x-axis (has y-coordinate zero). [This shows that S is closed under addition.] 


2d: [To verify that S is a subspace, three properties need to be shown—that the zero vector is in S; that S is 
closed under addition; and that S is closed under scalar multiplication.] 


In P3(R), 0 = (z(x) = 0) and z(x) = 0 is in S since it is a polynomial of degree less than three with roots at 3 and 
18. [This shows that the zero vector is in S.] For any p in S, p(3) = p(18) = 0 and p is a polynomial of degree 
three or less. Since multiplying a polynomial by a scalar does not increase its degree, sp is a polynomial of 
degree three or less for any scalar s. Moreover (sp)(3) = s(p(3)) = s·0 = 0 and (sp)(18) = s(p(18)) = s·0 = 0, 
so sp has roots at 3 and 18. [This shows that S is closed under scalar multiplication.] Similarly, for any 
q in S, q(3) = q(18) = 0 and q is a polynomial of degree three or less. Since the sum of two polynomials 
has degree no higher than the one with highest degree, p + q is a polynomial of degree at most 3. Moreover 
(p + q)(3) = p(3) + q(3) = 0 + 0 = 0 and (p + q)(18) = p(18) + q(18) = 0 + 0 = 0, so p + q has roots at 3 and 
18. [This shows that S is closed under addition.] 


3a: [To show that S is not a subspace, one of the three subspace properties needs to be shown false—that the 
zero vector is in S; that S is closed under addition; and that S is closed under scalar multiplication. The 
easiest way to do this is often by counterexample (an example showing that one of the properties does 
not hold in all instances).] $\begin{bmatrix} 3 \\ 14 \end{bmatrix}$ is in S but $-2\begin{bmatrix} 3 \\ 14 \end{bmatrix} = \begin{bmatrix} -6 \\ -28 \end{bmatrix}$ is not, so S is not closed under scalar 
multiplication. Note: S contains the zero vector and is closed under addition, so the only property for which a 
counterexample can be found is closure under scalar multiplication. There are many counterexamples 
available. Only one is needed. 


3d: [To show that S is not a subspace, one of the three subspace properties needs to be shown false—that the 
zero vector is in S; that S is closed under addition; and that S is closed under scalar multiplication. The 
easiest way to do this is often by counterexample (an example showing that one of the properties does not 
hold in all instances).] 0 = (z(x) = 0) is not in S since z(0) = 0 (has y-intercept 0, not 3). Consequently the 
zero vector is not in S. Note: S is not closed under addition nor closed under scalar multiplication either. 
There are many exceptions to the properties available. Only one is needed. 


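4a: By definition, 

\[
\operatorname{span} S = \left\{ r\begin{bmatrix} 1 \\ 0 \end{bmatrix} : r \text{ in } \mathbb{R} \right\}
\]

so span S contains the vector $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and all multiples thereof, including $0\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Each multiple points 
in the same or opposite direction, and there is a multiple for every magnitude. Therefore all the points on the 
line containing $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ are in span S and nothing more. In summary, span S is the line passing through $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ 
and $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. 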


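4c: By definition, 

\[
\operatorname{span} S = \left\{ r\begin{bmatrix} 5 \\ 1 \end{bmatrix} + s\begin{bmatrix} 2 \\ 0 \end{bmatrix} : r, s \text{ in } \mathbb{R} \right\}.
\]

Equivalently, $\operatorname{span} S = \left\{ \begin{bmatrix} 5 & 2 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} r \\ s \end{bmatrix} : r, s \text{ in } \mathbb{R} \right\}$ since a matrix times a vector is a linear combination of 
the columns of the matrix. Since $\begin{bmatrix} 5 & 2 \\ 1 & 0 \end{bmatrix}$ has nonzero determinant (5 · 0 − 2 · 1 = −2), theorem 7 assures that 
$\begin{bmatrix} 5 & 2 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} r \\ s \end{bmatrix} = b$ has exactly one solution for every b in $\mathbb{R}^2$. In other words, every vector in $\mathbb{R}^2$ is in the span 
of S, so span S = $\mathbb{R}^2$. 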
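5f: By definition, 

\[
\operatorname{span} S = \left\{ a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} + c\begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix} : a, b, c \text{ in } \mathbb{R} \right\}
\]

so $\operatorname{span} S = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & b \end{bmatrix} + \begin{bmatrix} 0 & -c \\ 0 & 0 \end{bmatrix} : a, b, c \text{ in } \mathbb{R} \right\} = \left\{ \begin{bmatrix} a & -c \\ 0 & b \end{bmatrix} : a, b, c \text{ in } \mathbb{R} \right\}$. Since a, b, c are 
arbitrary real numbers, this is equivalent to $\left\{ \begin{bmatrix} r_1 & r_2 \\ 0 & r_3 \end{bmatrix} : r_1, r_2, r_3 \text{ in } \mathbb{R} \right\}$, which is the set of all 2 × 2 matrices 
with a zero in the 2,1-entry. 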


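8: Since $\begin{bmatrix} 3t \\ -t \\ 5t \end{bmatrix} = t\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}$, $S = \left\{ t\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix} : t \text{ in } \mathbb{R} \right\}$ and S contains all multiples of $\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}$, or in other words all 
linear combinations of vectors in $\left\{ \begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix} \right\}$. Hence the desired set is $\left\{ \begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix} \right\}$. 

9: Since $\begin{bmatrix} 3t - 2s \\ -t + s \\ 5t + s \end{bmatrix} = t\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix} + s\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$, $S = \left\{ t\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix} + s\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} : s, t \text{ in } \mathbb{R} \right\}$. This is the set of all linear 
combinations of $\begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$, the very definition of $\operatorname{span}\left\{ \begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \right\}$. Hence the desired set 
is $\left\{ \begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \right\}$. 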


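12b: By definition, $\operatorname{span} S = \left\{ x\begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix} + y\begin{bmatrix} 10 \\ 2 \\ -6 \end{bmatrix} : x, y \text{ in } \mathbb{R} \right\}$ so the question is whether $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ is in this set. In other 
words, can we solve the equation $x\begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix} + y\begin{bmatrix} 10 \\ 2 \\ -6 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$? Written in matrix form, that is $\begin{bmatrix} 7 & 10 \\ 8 & 2 \\ 9 & -6 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ 
and whether there is a solution can be determined by reducing the augmented matrix: 

\[
\left[\begin{array}{rr|r} 7 & 10 & 1 \\ 8 & 2 & 2 \\ 9 & -6 & 3 \end{array}\right]
\longrightarrow
\left[\begin{array}{rr|r} 1 & 58 & -5 \\ 0 & 66 & -6 \\ 0 & 0 & 0 \end{array}\right]
\]

Hence the system is consistent. There is a solution, and therefore $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ is in span S. 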

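13c: By definition, $\operatorname{span}\{\sin^2\theta, \cos^2\theta\} = \{a\sin^2\theta + b\cos^2\theta : a, b \text{ in } \mathbb{R}\}$ so the question is whether $\cos(2\theta)$ is in this 
set. Since $\cos(2\theta) = \cos^2\theta - \sin^2\theta = 1\cos^2\theta + (-1)\sin^2\theta$ [this is a standard trig identity—double angle 
formula], $\cos(2\theta)$ is in $\operatorname{span}\{\sin^2\theta, \cos^2\theta\}$. 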


14c: Since v = (78, 81, 84, 87, 90, ...) can be written as 
3(1,2,3,4,5,...)+75(1,1,1,1,1,...) 


v is a linear combination of the vectors in the given set. Since the span includes all linear combinations of its 
elements, it includes this one. Therefore the answer is yes. 
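15a: Implied in the question is that we are to treat the columns of the matrix as individual vectors. Therefore we are 
to determine whether $\begin{bmatrix} -240 \\ -406 \\ -416 \end{bmatrix}$ is in 

\[
\left\{ x\begin{bmatrix} -499 \\ -425 \\ 306 \end{bmatrix} + y\begin{bmatrix} -288 \\ -161 \\ -348 \end{bmatrix} + z\begin{bmatrix} 232 \\ 125 \\ 141 \end{bmatrix} : x, y, z \text{ in } \mathbb{R} \right\}.
\]

But this set is exactly $\left\{ \begin{bmatrix} -499 & -288 & 232 \\ -425 & -161 & 125 \\ 306 & -348 & 141 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} : x, y, z \text{ in } \mathbb{R} \right\}$ so we need to determine whether 

\[
\begin{bmatrix} -499 & -288 & 232 \\ -425 & -161 & 125 \\ 306 & -348 & 141 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -240 \\ -406 \\ -416 \end{bmatrix} \tag{7.5.18}
\]

has a solution, which can be done by reducing the augmented matrix $\left[\begin{array}{rrr|r} -499 & -288 & 232 & -240 \\ -425 & -161 & 125 & -406 \\ 306 & -348 & 141 & -416 \end{array}\right]$. Adding 
the one line A.rref() to the SageMath code produces 


[ 1 0 0 1533968/1012773] 
[ 0 1 0 2328386/337591] 
[ 0 0 1 10922888/1012773] 


for the reduced matrix, indicating that system (7.5.18) is consistent (has a solution). Yes, b is in the span of 
the columns of M. 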


Section 4.2 


4d: (i) This is a subset of $M_{2\times 2}(\mathbb{R})$. (ii) The content of the final paragraph of this section (section 4.2) is that any 
vector that can be written as a linear combination of the others in a set can be removed without affecting the 
span of the set. In this case, $\begin{bmatrix} -4 & -19 \\ 14 & -13 \end{bmatrix} + \begin{bmatrix} -19 & -2 \\ 14 & -1 \end{bmatrix} = \begin{bmatrix} -23 & -21 \\ 28 & -14 \end{bmatrix}$ so $\begin{bmatrix} -23 & -21 \\ 28 & -14 \end{bmatrix}$ is a linear 
combination of the others and we can remove it without affecting the span. Hence $\left\{ \begin{bmatrix} -4 & -19 \\ 14 & -13 \end{bmatrix}, \begin{bmatrix} -19 & -2 \\ 14 & -1 \end{bmatrix} \right\}$ 
is a subset with the same span. 

5d: $\begin{bmatrix} -4 & -19 \\ 14 & -13 \end{bmatrix} = \begin{bmatrix} -23 & -21 \\ 28 & -14 \end{bmatrix} - \begin{bmatrix} -19 & -2 \\ 14 & -1 \end{bmatrix}$ so $\begin{bmatrix} -4 & -19 \\ 14 & -13 \end{bmatrix}$ can be removed from the set without affecting 
its span. Hence $\left\{ \begin{bmatrix} -23 & -21 \\ 28 & -14 \end{bmatrix}, \begin{bmatrix} -19 & -2 \\ 14 & -1 \end{bmatrix} \right\}$ is a (different) subset with the same span. Yes, this subset is 
a basis for the span because it is both spanning (as we know) and linearly independent (since it contains two 
elements which are not multiples of one another). 


6c: Let $S = \{1 + 2t,\ 3 + t - 2t^2,\ -5 + 4t^2,\ -5t - 2t^2\}$. Since S is a 4-element subset of P2(R), which has dimension 3, 
we know that S is linearly dependent (theorem 9). Hence the equation 

\[
A(1 + 2t) + B(3 + t - 2t^2) + C(-5 + 4t^2) + D(-5t - 2t^2) = 0 \tag{7.5.19}
\]

has nontrivial solutions. We need to find one. Expanding the lefthand side and collecting like terms: 

\[
A + 2At + 3B + Bt - 2Bt^2 - 5C + 4Ct^2 - 5Dt - 2Dt^2 = 0
\]
\[
(A + 3B - 5C) + (2A + B - 5D)t + (-2B + 4C - 2D)t^2 = 0
\]

For this equation to be true (for all t, making the lefthand side equal to the zero function, the zero vector of the 
vector space), it must be that each coefficient is zero, giving the linear system 

\[
\begin{aligned}
A + 3B - 5C &= 0 \\
2A + B - 5D &= 0 \\
-2B + 4C - 2D &= 0
\end{aligned}
\]

which we can solve in a number of ways. One solution is A = −1, B = 2, C = 1, D = 0. Hence 

\[
-1(1 + 2t) + 2(3 + t - 2t^2) + 1(-5 + 4t^2) + 0(-5t - 2t^2) = 0
\]

and we can write (for example) 1 + 2t as a linear combination of the others: 

\[
1 + 2t = 2(3 + t - 2t^2) + (-5 + 4t^2)
\]

[we could have chosen to write $(3 + t - 2t^2)$ or $(-5 + 4t^2)$ as a linear combination of the others just as 
well]. This means we can remove 1 + 2t from the set and retain its span. If the set of the three remaining 
vectors is linearly independent, it is a basis. If it is linearly dependent, we need to eliminate one of the vectors 
from $\{3 + t - 2t^2,\ -5 + 4t^2,\ -5t - 2t^2\}$. Repeating the above process without the polynomial 1 + 2t amounts to 
eliminating A from the equations: does 

\[
B(3 + t - 2t^2) + C(-5 + 4t^2) + D(-5t - 2t^2) = 0
\]

have nontrivial solutions? We check as before by solving the linear system 

\[
\begin{aligned}
3B - 5C &= 0 \\
B - 5D &= 0 \\
-2B + 4C - 2D &= 0.
\end{aligned}
\]

One solution is B = 5, C = 3, D = 1. Hence 

\[
5(3 + t - 2t^2) + 3(-5 + 4t^2) + (-5t - 2t^2) = 0
\]

and we can write (for example) $-5t - 2t^2$ as a linear combination of the others: 

\[
-5t - 2t^2 = -5(3 + t - 2t^2) - 3(-5 + 4t^2)
\]

[we could have chosen to write $(3 + t - 2t^2)$ or $(-5 + 4t^2)$ as a linear combination of the others just as 
well]. This means we can remove $-5t - 2t^2$ from the set and retain its span, leaving two vectors (polynomials), 
$(3 + t - 2t^2)$ and $(-5 + 4t^2)$, that are not multiples of one another. Hence $\{3 + t - 2t^2,\ -5 + 4t^2\}$ is a linearly 
independent subset of S with the same span—a basis for the span of S. The dimension of the span is two. 
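
The two dependence relations found in 6c are easy to verify by storing each polynomial as its list of coefficients (constant, t, t²). The sketch below is illustrative only; the names `p1` through `p4` and `combine` are assumptions of this sketch.

```python
# Each polynomial is stored as [constant, coefficient of t, coefficient of t^2].
p1 = [1, 2, 0]     # 1 + 2t
p2 = [3, 1, -2]    # 3 + t - 2t^2
p3 = [-5, 0, 4]    # -5 + 4t^2
p4 = [0, -5, -2]   # -5t - 2t^2

def combine(weights, polys):
    """Coefficient list of the linear combination sum(w * p)."""
    return [sum(w * p[i] for w, p in zip(weights, polys)) for i in range(3)]

# A = -1, B = 2, C = 1, D = 0 should give the zero polynomial.
first = combine([-1, 2, 1, 0], [p1, p2, p3, p4])

# B = 5, C = 3, D = 1 should give the zero polynomial as well.
second = combine([5, 3, 1], [p2, p3, p4])

# p2 and p3 are multiples of one another only if every 2x2 minor of the
# pair of coefficient lists vanishes; one nonzero minor rules that out.
minor = p2[0] * p3[1] - p2[1] * p3[0]

print(first, second, minor)
```

Both combinations come out as the zero polynomial, and the nonzero minor confirms that the two remaining polynomials are not multiples of one another.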


12b: Since the dimension of $\mathbb{R}^4$ is four, a linearly independent spanning set with 4 elements will be a basis. The given 
set (of columns of the matrix) has four elements, so we need to determine whether it is linearly independent 
and spanning. Upon inspection, columns two and three are identical (making each one a linear combination of 
the other vectors in the set), so the set of columns is linearly dependent and does not form a basis for $\mathbb{R}^4$. 


13b: Since the dimension of $\mathbb{R}^6$ is six, a linearly independent spanning set with 6 elements will be a basis. The given 
set (of columns of the matrix) has six elements, so we need to determine whether it is linearly independent 
and spanning. By theorem 7, determining linear independence is equivalent to determining whether there is a 
pivot position in every column—done by row reduction. Adding the line print(M.rref()) to the provided 
SageCell produces the reduced matrix 


[ 1 0 0 0 0 -501/35] 
[ 0 1 0 0 0 273/5] 
[ 0 0 1 0 0 71/15] 
[ 0 0 0 1 0 -2] 
[ 0 0 0 0 1 4/3] 
[ 0 0 0 0 0 0] 

revealing that the matrix does not have a pivot position in every row (or column). Hence the columns of the 
matrix are linearly dependent and do not form a basis for $\mathbb{R}^6$. [Had there been a pivot position in each 
column, we would have needed to address the question of whether the columns were spanning—theorem 
7 part (ix) provides the answer.] 


Section 4.3 


4: (a) Answers will vary. Any set that contains all the ordered pairs’ first components will do. For example, 
{1, 2, 3, 4, 5}, Z, Q, and R are four possible domains. 
(b) Answers will vary. Any set that contains all the ordered pairs’ second components will do. For example, 
$\{t, t^2, t^3, t^4, t^5\}$, P5(R), P55(R), and C(R) are four possible codomains. 


(c) The range of a relation is the image of its domain. In this case it is $\{t, t^2, t^3, t^4, t^5\}$. You might think of it as 
the smallest possible codomain. 


6: (a) The domain of a function is the preimage of its codomain. In this case {1, 2, 3, 4, 5}. 
(b) Answers will vary. Any set that contains all the ordered pairs’ second components will do. For example, 
$\{t, t^2, t^3, t^4, t^5\}$, P5(R), P55(R), and C(R) are four possible codomains. 


(c) The range of a function is the image of its domain. In this case it is $\{t, t^2, t^3, t^4, t^5\}$. You might think of it as 
the smallest possible codomain. 


8: (a) The statement of the question implies that 3 is a member of the domain and we are to figure out the 
corresponding member of the range. This is precisely what the formula is for: $p(3) = \frac{5 \cdot 3}{1 + 3^2} = \frac{15}{10} = \frac{3}{2}$. 

(b) $p(23) = \frac{5 \cdot 23}{1 + 23^2} = \frac{115}{530} = \frac{23}{106}$. 
(c) A preimage of 2 is any number z such that p(z) = 2. Any solution of $2 = \frac{5z}{1 + z^2}$ will do. The most immediate 
solution is z = 2. 
(d) The preimage of 2 is the set of all numbers z such that p(z) = 2. Solving $2 = \frac{5z}{1 + z^2}$: 

\[
\begin{aligned}
2(1 + z^2) &= 5z \\
2 - 5z + 2z^2 &= 0 \\
(2 - z)(1 - 2z) &= 0
\end{aligned}
\]

The only solutions are $z = 2, \frac{1}{2}$ so the preimage of 2 is $\left\{\frac{1}{2}, 2\right\}$. (e) $p(\{1, 2, 3\}) = \left\{\frac{5}{2}, 2, \frac{3}{2}\right\}$. 
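
Images and preimages under p are simple to tabulate with exact fractions. The sketch below assumes the formula p(z) = 5z/(1 + z²), which is what the algebra in part (d) indicates; the helper `preimage_of` is a hypothetical name and only handles integer targets whose discriminant is a perfect square, as is the case for 2.

```python
from fractions import Fraction
from math import isqrt

def p(z):
    """The function from exercise 8, read as p(z) = 5z / (1 + z^2)."""
    z = Fraction(z)
    return 5 * z / (1 + z * z)

def preimage_of(target):
    """Solve p(z) = target, i.e. target*z^2 - 5z + target = 0, for an integer target.

    Assumes the discriminant is a nonnegative perfect square (true for target = 2).
    """
    a, b, c = target, -5, target
    disc = b * b - 4 * a * c
    root = isqrt(disc)
    return {Fraction(-b - root, 2 * a), Fraction(-b + root, 2 * a)}

print(p(3), p(23))
print(sorted(preimage_of(2)))
print(sorted(p(z) for z in (1, 2, 3)))
```

Clearing the denominator is safe here because 1 + z² is never zero for real z.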
10: For any relation $\{(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)\}$ its inverse is $\{(b_1, a_1), (b_2, a_2), \ldots, (b_n, a_n)\}$. In this case, $R^{-1}$ = 


cei 


—2 | 3r a as a subset of R? x P2(R). 
12b: The image of a means T(a): 


0 


Section 4.4 


3a: [solution 1] From the table on page 141, the elementary matrices associated with this composition are 

1. reflection about the y-axis (horizontal scale by factor −1): $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$ 
2. vertical scale by a factor of 2: $\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$ 
3. horizontal scale by a factor of 2: $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$ 

Altogether the standard matrix is the product of these elementary matrices right to left: 

\[
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -2 & 0 \\ 0 & 2 \end{bmatrix}
\]

Note: The order of multiplication in this particular example does not matter. These particular elementary 
matrices commute with one another. However, not all elementary matrices commute. Generally, the order of 
multiplication matters. 


[solution 2] The standard matrix can always be determined by finding the images of the standard basis vectors. 
In this case, $I_{:,1}$ maps to $\begin{bmatrix} -2 \\ 0 \end{bmatrix}$ and $I_{:,2}$ maps to $\begin{bmatrix} 0 \\ 2 \end{bmatrix}$, verifying that indeed $\begin{bmatrix} -2 & 0 \\ 0 & 2 \end{bmatrix}$ is the standard matrix 
for T. 


Note: This method should be considered a double check when possible. It is not always so simple to determine 
the images of the standard basis vectors. 


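4b: Transformations with the same standard matrix are the same transformation. Determining the standard matrices 
for S and T (using the chart on page 141) will answer the question. 


1. S: reflection about the y-axis (horizontal scaling by a factor of −1), $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$, followed by rotation $\frac{\pi}{4}$ 
radians clockwise ($-\frac{\pi}{4}$ radians counterclockwise) about the origin, 

\[
\begin{bmatrix} \cos\left(-\frac{\pi}{4}\right) & -\sin\left(-\frac{\pi}{4}\right) \\ \sin\left(-\frac{\pi}{4}\right) & \cos\left(-\frac{\pi}{4}\right) \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}:
\]
\[
\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}
\]

2. T: rotation $\frac{3\pi}{4}$ radians clockwise ($-\frac{3\pi}{4}$ radians counterclockwise) about the origin, 

\[
\begin{bmatrix} \cos\left(-\frac{3\pi}{4}\right) & -\sin\left(-\frac{3\pi}{4}\right) \\ \sin\left(-\frac{3\pi}{4}\right) & \cos\left(-\frac{3\pi}{4}\right) \end{bmatrix} = \begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix},
\]

followed by reflection about the x-axis (vertical scaling by a factor of −1), $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$: 

\[
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}
\]

Since S and T have the same standard matrix, they must be the same transformation (despite the differing 
descriptions). 


Note: It is equivalent to see geometrically that the images of the standard basis vectors (the columns of the 
standard matrix) are the same for S and T. 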
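
A quick floating point check of 4b is sketched below. It assumes the angles are π/4 for S and 3π/4 for T, both clockwise, as the entries of the matrices suggest; the helper names `matmul` and `rotation` are made up for this sketch.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rotation(theta):
    """Standard matrix of counterclockwise rotation by theta radians."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

reflect_y = [[-1, 0], [0, 1]]   # reflection about the y-axis
reflect_x = [[1, 0], [0, -1]]   # reflection about the x-axis

# Later transformations multiply on the left.
S = matmul(rotation(-math.pi / 4), reflect_y)
T = matmul(reflect_x, rotation(-3 * math.pi / 4))

same = all(math.isclose(S[i][j], T[i][j], abs_tol=1e-12)
           for i in range(2) for j in range(2))
print(S, T, same)
```

Because the entries are floating point numbers, the comparison uses a small tolerance rather than exact equality.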


5b: Given are the images of the standard basis vectors, which are the columns of the standard matrix. There is no 
work to do but collect the images as columns of a matrix: 


-7.5  -5.2 

1 10.1 
-13.2 -4.3 
—3.7 43 


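7a: (solution 1) The columns of the standard matrix are the images of the standard basis vectors. Writing the standard 
basis vectors as linear combinations of the given preimages will allow computation of the images of the standard 
basis vectors. For example, to write $I_{:,1}$ as a linear combination of $\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix}, \begin{bmatrix} -6 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -5 \\ -1 \\ 1 \end{bmatrix}$ it suffices to solve 
the matrix equation 

\[
\begin{bmatrix} 41 & -6 & -5 \\ -3 & 1 & -1 \\ -2 & 0 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]

which can be done by row reduction: 

\[
\left[\begin{array}{rrr|r} 41 & -6 & -5 & 1 \\ -3 & 1 & -1 & 0 \\ -2 & 0 & 1 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 2 \end{array}\right].
\]

Therefore, 

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix} + 5\begin{bmatrix} -6 \\ 1 \\ 0 \end{bmatrix} + 2\begin{bmatrix} -5 \\ -1 \\ 1 \end{bmatrix}
\]

and using the fact that T is a linear transformation, it then follows that 

\[
T(I_{:,1}) = 1\begin{bmatrix} 7 \\ -4 \end{bmatrix} + 5\begin{bmatrix} -13 \\ -3 \end{bmatrix} + 2\begin{bmatrix} -9 \\ -13 \end{bmatrix} = \begin{bmatrix} -76 \\ -45 \end{bmatrix}.
\]

The first column of the standard matrix is hence $\begin{bmatrix} -76 \\ -45 \end{bmatrix}$. Repeating the process for the second and third 
standard basis vectors gives the second and third columns of the standard matrix. The whole computation is 
neatly done in this SageMath Cell. The standard matrix is 

\[
A = \begin{bmatrix} -76 & -469 & -858 \\ -45 & -273 & -511 \end{bmatrix}.
\]

Double checking, we can multiply A times each given preimage to make sure it matches with the given image. 
For example, 

\[
A\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} -76 & -469 & -858 \\ -45 & -273 & -511 \end{bmatrix}\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} 7 \\ -4 \end{bmatrix}
\]

as required. 


(solution 2) We can also use the given facts more directly, and the definition of the standard matrix less directly. 
Given that $T\left(\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix}\right) = \begin{bmatrix} 7 \\ -4 \end{bmatrix}$, $T\left(\begin{bmatrix} -6 \\ 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} -13 \\ -3 \end{bmatrix}$, and $T\left(\begin{bmatrix} -5 \\ -1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} -9 \\ -13 \end{bmatrix}$, this may be rephrased 
in terms of the standard matrix A as $A\begin{bmatrix} 41 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} 7 \\ -4 \end{bmatrix}$, $A\begin{bmatrix} -6 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -13 \\ -3 \end{bmatrix}$, and $A\begin{bmatrix} -5 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -9 \\ -13 \end{bmatrix}$. 
Combining this information into a single equation, 

\[
A\begin{bmatrix} 41 & -6 & -5 \\ -3 & 1 & -1 \\ -2 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 7 & -13 & -9 \\ -4 & -3 & -13 \end{bmatrix}.
\]

Therefore 

\[
A = \begin{bmatrix} 7 & -13 & -9 \\ -4 & -3 & -13 \end{bmatrix}\begin{bmatrix} 41 & -6 & -5 \\ -3 & 1 & -1 \\ -2 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} -76 & -469 & -858 \\ -45 & -273 & -511 \end{bmatrix}.
\]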
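
Both solutions of 7a can be double checked with integer arithmetic alone. The sketch below is an illustration, with the matrices entered as they are read from the solution; `matmul` and the variable names are assumptions of this sketch and are not the contents of the SageMath Cell mentioned above.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[-76, -469, -858], [-45, -273, -511]]           # claimed standard matrix
P = [[41, -6, -5], [-3, 1, -1], [-2, 0, 1]]          # given preimages, as columns
B = [[7, -13, -9], [-4, -3, -13]]                    # given images, as columns

# Solution 2: the single equation A P = B.
product = matmul(A, P)

# Solution 1: the first column of A as the combination 1, 5, 2 of the images.
weights = [1, 5, 2]
first_column = [sum(w * B[i][j] for j, w in enumerate(weights)) for i in range(2)]

print(product == B, first_column)
```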


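12: (a) As can be seen by the following diagram showing a generic rotation, $R_\theta(u + cv) = R_\theta(u) + cR_\theta(v)$. 

[Diagram: a generic rotation applied to u, cv, and u + cv.] 

(b) $R_\theta(I_{:,1}) = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$ and $R_\theta(I_{:,2}) = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$ as seen below. 

[Diagram: the standard basis vectors rotated by θ; the image of $I_{:,1}$ ends at (cos θ, sin θ).] 

(c) $R_\theta(u) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}u$ 

(d) Letting the arbitrary vector $\begin{bmatrix} x \\ y \end{bmatrix}$ have magnitude r and angle of incidence with the positive x-axis α: 

\[
\begin{aligned}
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
&= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} r\cos\alpha \\ r\sin\alpha \end{bmatrix}
= \begin{bmatrix} r\cos\theta\cos\alpha - r\sin\theta\sin\alpha \\ r\sin\theta\cos\alpha + r\cos\theta\sin\alpha \end{bmatrix} \\
&= \begin{bmatrix} r(\cos\theta\cos\alpha - \sin\theta\sin\alpha) \\ r(\sin\theta\cos\alpha + \cos\theta\sin\alpha) \end{bmatrix}
= \begin{bmatrix} r\cos(\alpha + \theta) \\ r\sin(\alpha + \theta) \end{bmatrix}
= R_\theta\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)
\end{aligned}
\]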
13a: (i) Perhaps the simplest method for obtaining the standard matrix is to graph the action of the transformation on 
the standard basis vectors. Their images form the columns of the standard matrix. In this case, the following 
diagram demonstrates this action. 


and basis vector J.» maps to | = | so the standard matrix is : ES 


Basis vector J. ; maps to 0 


1 
(11) The image of ; | means T (| |) which is calculated ass 


lider ofl] 


Section 4.5 


1: q, h, and g are one-to-one. 
q: The equation q(a) = b can be solved as follows, demonstrating that it has at most one solution: 

\[ e^a = b \Rightarrow a = \ln(b) \quad \text{if } b > 0. \]

h: The equation h(a) = b can be solved as follows, demonstrating that it has at most one solution: 

\[ \sqrt{a} = b \Rightarrow a = b^2 \quad \text{if } b \geq 0. \]

g: The equation g(x) = b can be solved as follows, demonstrating that it has at most one solution: 

\[ 3x - 9 = b \Rightarrow 3x = b + 9 \Rightarrow x = \frac{b + 9}{3}. \]

To show that a function is not one-to-one, it is enough to provide a counterexample. 

f is not one-to-one since 0 and 2π are both in the domain and f(0) = f(2π) = 0, demonstrating that the equation 
f(a) = 0 has more than one solution. 

p is not one-to-one since −5 and 5 are both in the domain and p(−5) = p(5) = 25, demonstrating that the 
equation p(a) = 25 has more than one solution. 


2: f and h are onto. 
f: The equation f(a) = b, where b is any member of the codomain R, is satisfied as follows, demonstrating 


that it has at least one solution. a = - i is in the domain of f and 


3/b b 
o=- if oah= 29-20 ab, 
2 2 


h: The equation h(a) = b, where b is any member of the codomain [0, ∞), can be solved as follows, 
demonstrating that it has at least one solution. $a = b^2$ is in the domain of h and 

\[ a = b^2 \Rightarrow \sqrt{a} = b. \]

To show that a function is not onto, it is enough to provide a counterexample. 

q is not onto since 0 is in the codomain, but the equation q(a) = 0, which is to say $e^a + 1 = 0$, has no solution 
in the domain, R. 

p is not onto since −3 is in the codomain, but the equation p(a) = −3, which is to say $a^2 = -3$, has no solution 
in the domain, [0, ∞). 

g is not onto since 3 is in the codomain, but the equation g(a) = 3, which is to say cos(a) = 3, has no solution 
in the domain, [0, 5. 


6: Only (a). First, there are multiple elements of the domain with the same image. For example, 

\[ T((1, 2, 3, 4, 1, 2, 3, 4, \ldots)) = (1, 2, 3, 4) \]
and 
\[ T((1, 2, 3, 4, 5, 6, 7, 8, \ldots)) = (1, 2, 3, 4) \]

(and therefore the equation T(a) = (1, 2, 3, 4) has more than one solution, violating the definition of one-to-one). 
Second, for example, 

\[ T((s_1, s_2, s_3, s_4, 1, 1, 1, 1, 1, \ldots)) = (s_1, s_2, s_3, s_4) \]

for any vector $(s_1, s_2, s_3, s_4)$ (and therefore the equation T(a) = b has at least one solution for every element of 
the codomain and T is onto). 
Third, 

\[
\begin{aligned}
T((s_1, s_2, s_3, s_4, s_5, \ldots) + c(r_1, r_2, r_3, r_4, r_5, \ldots))
&= T((s_1 + cr_1, s_2 + cr_2, s_3 + cr_3, s_4 + cr_4, s_5 + cr_5, \ldots)) \\
&= (s_1 + cr_1, s_2 + cr_2, s_3 + cr_3, s_4 + cr_4) \\
&= (s_1, s_2, s_3, s_4) + (cr_1, cr_2, cr_3, cr_4) \\
&= (s_1, s_2, s_3, s_4) + c(r_1, r_2, r_3, r_4)
\end{aligned}
\]
so T is linear. 


9: (a) Yes. $T(f_1 + cf_2) = e^x(f_1 + cf_2) = e^x f_1 + ce^x f_2 = T(f_1) + cT(f_2)$. 
(b) Yes. Let b be an element of the codomain and suppose $T(f_1) = b$ and $T(f_2) = b$, which is to say the equation T(a) = b has more than one solution. Then 
$T(f_1) = T(f_2)$ so $e^x f_1(x) = e^x f_2(x)$. Dividing both sides of the equation by $e^x$, we find that $f_1(x) = f_2(x)$. 
Therefore, in fact the “two solutions” must actually be one and the same (and we have shown that the equation 
T(a) = b has at most one solution). 
(c) Yes. Let b be an element of the codomain (a function) and set $f(x) = \frac{b(x)}{e^x}$. Then $T(f) = e^x \cdot \frac{b(x)}{e^x} = b(x)$ so the 
equation T(a) = b has at least one solution (f). 
(d) Yes. T is one-to-one, onto, and linear. 

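10c: The definitions of one-to-one and onto depend on the possible numbers of solutions of the equation T(a) = b, 
which in this case is 

\[
\begin{bmatrix} 6 & -4 \\ -15 & 15 \\ 12 & 9 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \tag{7.5.20}
\]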

(i) By theorem 5, equation (7.5.20) has at most one solution for each b precisely if the columns of the coefficient 
matrix are linearly independent. By inspection, we can see that they are (they are not multiples of one another), 
so T(a) = b has at most one solution for each b in the codomain and T is one-to-one. 

(ii) By theorem 6, equation (7.5.20) has at least one solution for each b precisely if the coefficient matrix has a 
pivot position in every row. This is, however, impossible since there are 3 rows and at most 2 pivot positions, 
one for each column. Therefore T(a) = b has no solution for some elements b of the codomain and T is not 
onto. 

(iii) No. T is not onto and therefore is not an isomorphism. 
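Section 4.6 


1a: The five properties defining an inner product must be verified. Direct computation will suffice. Each string of 
equalities hinges on the algebra of real numbers/variables. 

1. $\langle u, u\rangle = \left\langle \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \right\rangle = 2u_1u_1 + 3u_2u_2 = 2u_1^2 + 3u_2^2 \geq 0$ 

2. If $\langle u, u\rangle = 0$ then $2u_1^2 + 3u_2^2 = 0$ so it must be that $u_1^2 = u_2^2 = 0$, which means $u_1 = u_2 = 0$ and therefore 
u = 0. On the other hand, if u = 0 then $u_1 = u_2 = 0$ so $\langle u, u\rangle = 2u_1^2 + 3u_2^2 = 0$. 

3. $\langle u, v\rangle = \left\langle \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \right\rangle = 2u_1v_1 + 3u_2v_2 = 2v_1u_1 + 3v_2u_2 = \left\langle \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}, \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \right\rangle = \langle v, u\rangle$ 

4. $\langle u + v, w\rangle = \left\langle \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}, \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} \right\rangle = 2(u_1 + v_1)w_1 + 3(u_2 + v_2)w_2 = (2u_1w_1 + 3u_2w_2) + (2v_1w_1 + 3v_2w_2) = \langle u, w\rangle + \langle v, w\rangle$ 

5. $\langle cu, v\rangle = \left\langle \begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}, \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \right\rangle = 2cu_1v_1 + 3cu_2v_2 = c(2u_1v_1 + 3u_2v_2) = c\langle u, v\rangle$ 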

10a: By definition, two vectors are orthogonal if their inner product is zero. Calculating, 


ww =(| a fe )-[: 10 | 9 ele 10 |] *p [= 150- 150-0 


so u and v are orthogonal. 


11c: By definition, two vectors are orthogonal if their inner product is zero. Calculating, 


⟨u, v⟩ = u(0)v(0) + u(1)v(1) + u(2)v(2) + u(3)v(3) 
= (3)(0) + (O)-D + (1) + (0)G3) = 0 


so u and v are orthogonal. 


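12a: By definition, two vectors are orthogonal if their inner product is zero. Calculating, 

\[
\begin{aligned}
\langle u, v\rangle &= \frac{1}{\pi}\int_0^{2\pi} u(x)v(x)\,dx = \frac{1}{\pi}\int_0^{2\pi} (\sin x)(\cos x)\,dx = -\frac{1}{2\pi}\left[\cos^2 x\right]_0^{2\pi} \\
&= -\frac{1}{2\pi}\left[\cos^2(2\pi) - \cos^2(0)\right] = -\frac{1}{2\pi}[1 - 1] = 0
\end{aligned}
\]

so u and v are orthogonal. 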
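13a: By definition, two vectors are orthogonal if their inner product is zero. Calculating, 

\[
uv^T = \begin{bmatrix} 2 & 7 \\ -2 & 3 \end{bmatrix}\begin{bmatrix} -2 & -6 \\ 8 & -4 \end{bmatrix} = \begin{bmatrix} 52 & -40 \\ 28 & 0 \end{bmatrix}
\]
\[
\langle u, v\rangle = (uv^T)_{1,1} + (uv^T)_{2,2} = 52 + 0 = 52 \neq 0
\]

so u and v are not orthogonal. 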
19a: Given any vector v in a vector space, v − v = 0 so 

⟨0, v⟩ = ⟨v − v, v⟩              by substitution 
       = ⟨v, v⟩ + ⟨−v, v⟩        by inner product property 4 
       = ⟨v, v⟩ + ⟨(−1)v, v⟩     by exercise 22 of section 4.1 
       = ⟨v, v⟩ + (−1)⟨v, v⟩     by inner product property 5 
       = ⟨v, v⟩ − ⟨v, v⟩ = 0     by algebra of real numbers 

Hence ⟨0, v⟩ = 0 and by inner product property 3, ⟨v, 0⟩ = ⟨0, v⟩ (which of course is zero). 

19c: Given any vectors u, v, w in a vector space, 

⟨u, v + w⟩ = ⟨v + w, u⟩          by inner product property 3 
           = ⟨v, u⟩ + ⟨w, u⟩     by inner product property 4 
           = ⟨u, v⟩ + ⟨u, w⟩     by inner product property 3 

Section 5.1 


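1d: The question is equivalent to asking whether b can be written as a linear combination of the columns of M. In 
turn, this is equivalent to whether Mv = b has a solution. Attempting to solve Mv = b by row reduction of the 
augmented matrix: 

\[
\left[\begin{array}{rr|r} 27 & 33 & 6 \\ 9 & 11 & 2 \end{array}\right]
\xrightarrow{\frac{1}{3}A_{1,:}\to A_{1,:}}
\left[\begin{array}{rr|r} 9 & 11 & 2 \\ 9 & 11 & 2 \end{array}\right]
\xrightarrow{-A_{1,:}+A_{2,:}\to A_{2,:}}
\left[\begin{array}{rr|r} 9 & 11 & 2 \\ 0 & 0 & 0 \end{array}\right]
\]

shows that there are infinitely many solutions. Hence b is in the column space of M. 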


2d: From the echelon form $\begin{bmatrix} 9 & 11 \\ 0 & 0 \end{bmatrix}$ of M (see solution above) the only pivot position is in the first column so the 
set containing just the first column of M, $\left\{ \begin{bmatrix} 27 \\ 9 \end{bmatrix} \right\}$, is a basis for the column space. 


3d: From the echelon form $\begin{bmatrix} 9 & 11 \\ 0 & 0 \end{bmatrix}$ of M, solutions of the homogeneous equation Mv = 0 must satisfy $9v_1 + 11v_2 = 0$ 
or $v_1 = -\frac{11}{9}v_2$. In parametric vector form, the solutions of Mv = 0 are $v = r\begin{bmatrix} 11 \\ -9 \end{bmatrix}$ 
and therefore $\left\{ \begin{bmatrix} 11 \\ -9 \end{bmatrix} \right\}$ is a basis for the null space. 


4a: (i) Yes. R does not have a leading coefficient in the rightmost column. Hence the system Mv = b is consistent. 
(ii) The pivot positions of M are in the first, third, and fourth columns, so the first, third, and fourth columns of 
M comprise a basis for the column space of M. 
Note: Since the columns of M are in $\mathbb{R}^3$ and there are three linearly independent columns, they must span all of 
$\mathbb{R}^3$ so any three linearly independent vectors form a basis for the column space of M. (iii) In parametric vector 
form, the solutions of Mv = 0 are $v = r\begin{bmatrix} 9 \\ -2 \\ 0 \\ 0 \end{bmatrix}$ so one basis for the null space is $\left\{ \begin{bmatrix} 9 \\ -2 \\ 0 \\ 0 \end{bmatrix} \right\}$. 

Note: Any set containing any nonzero multiple of $\begin{bmatrix} 9 \\ -2 \\ 0 \\ 0 \end{bmatrix}$ is also a basis for the null space. 

4d: (i) No. R has a leading coefficient in the rightmost column. Hence the system Mv = b is inconsistent. (ii) The 
pivot positions of M are in the first, second, and fourth columns, so the first, second, and fourth columns of M 
comprise a basis for the column space of M: 

\[
\left\{ \begin{bmatrix} 187 \\ 0 \\ -33 \\ -154 \end{bmatrix}, \begin{bmatrix} 99 \\ -11 \\ -22 \\ -77 \end{bmatrix}, \begin{bmatrix} -12 \\ 4 \\ 4 \\ 12 \end{bmatrix} \right\}
\]
Note: Any other set of three vectors with the same span would also do. (iii) In parametric vector form, the 
7 
solutions of Mv = 0 are v =r a so one basis for the null space is 
0 
7 
=): 
11 
0 
7 
Note: Any set containing any nonzero multiple of FS is also a basis for the null space. 
0 


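6a: (i) Because the first and fourth columns comprise a basis for the column space, there must be a linear combination 
of these two columns that sums to b. By inspection, 

\[
1\begin{bmatrix} -15 \\ 27 \\ 9 \end{bmatrix} + 1\begin{bmatrix} -7 \\ 42 \\ 7 \end{bmatrix} = \begin{bmatrix} -22 \\ 69 \\ 16 \end{bmatrix} = b
\]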


1 
0 
(ii) From above, one particular solution of Mv = bis v = | O |. The solution of the homogeneous system 
1 
0 
4 


Mv = 0 is given by the equations v, = -tyy - Sy, - tus and v4 = Sys. In parametric vector form, 
i _8 7 
3 3 3 
1 0 0 
v=r] 0 {+s} 1 |4+t} O 
0 0 $ 
0 0 1 
Hence the general solution of Mv = b is 
1 8 1 
I -3 -3 3 
0 1 0 0 
v=] 0 ]+r] O {+s} 1 {+t} O 
1 0 0 § 
0 0 0 1 
11: (a) b is −2 times the fourth column of M so $v = \begin{bmatrix} 0 \\ 0 \\ 0 \\ -2 \end{bmatrix}$ is one solution of Mv = b. 
(b) The general solution (the set of all solutions) of Mv = b is given by 

\[
v = \begin{bmatrix} 0 \\ 0 \\ 0 \\ -2 \end{bmatrix} + r\begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\]

(c) The solution in part (a) corresponds to the general solution with r = 0. Picking any three other real numbers 
for the parameter in the general solution will provide three solutions different from the one in (a). For example, 
using r = 1, 2, 3: 

\[
v = \begin{bmatrix} 0 \\ 1 \\ 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 0 \\ 2 \\ 2 \\ -2 \end{bmatrix}, \begin{bmatrix} 0 \\ 3 \\ 3 \\ -2 \end{bmatrix}
\]

are all solutions of Mv = b. 


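15b: The eigenspace corresponding to λ is the null space of M − λI. The dimension of the null space of M − λI is 
the number of free variables of the system (M − λI)v = 0, which can be found by row reducing M − λI: 

\[
\begin{bmatrix} 25 & 45 & -75 \\ 5 & 9 & -15 \\ 15 & 27 & -45 \end{bmatrix}
\xrightarrow[\frac{1}{3}A_{3,:}\to A_{3,:}]{\frac{1}{5}A_{1,:}\to A_{1,:}}
\begin{bmatrix} 5 & 9 & -15 \\ 5 & 9 & -15 \\ 5 & 9 & -15 \end{bmatrix}
\xrightarrow[-A_{1,:}+A_{3,:}\to A_{3,:}]{-A_{1,:}+A_{2,:}\to A_{2,:}}
\begin{bmatrix} 5 & 9 & -15 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

There is one pivot position and there are two free variables. The dimension of the eigenspace is 2. 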
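
Counting free variables amounts to subtracting the number of pivots (the rank) from the number of columns. The following Python sketch, with a hypothetical helper `rank`, carries out that count for the matrix M − λI as it is read above.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivot positions, found by forward elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it belongs to a free variable
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

shifted = [[25, 45, -75], [5, 9, -15], [15, 27, -45]]   # M - (lambda)I from 15b
pivots = rank(shifted)
free_variables = len(shifted[0]) - pivots
print(pivots, free_variables)
```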


Section 5.2 


5a: Finding coordinates for a vector v relative to a basis means finding a linear combination of the basis vectors that 
sums to the vector v. In this case we are to write (the vector) v = 3 — 4¢ + 5? as a linear combination of (the 
basis vectors) 3 — 4f and f’. This can be done by inspection: 


v=3-4¢+5f = 133-41) + 1(7) 


so the coordinates of v with respect to v are 1 and | and as a coordinate vector v = 1 


B 
6c: Finding coordinates for a vector v relative to a basis means finding a linear combination of the basis vectors that 


sums to the vector v. In this case we are to write (the vector) v = 


1 0 0 1 
0 0 0 2 


: ; as a linear combination of (the basis 
0 
1 


1 2 1 0 01 0 0 
3 ‘AB 0 |*2l 9 > |+3| 5 a 


1 
so the coordinates of v with respect to v are 1, 2, and 3, in that order, and as a coordinate vector v = | 2, | : 
3 
B 


, and 


> 


vectors) 


: This can be done by inspection: 


7e: Finding coordinates for a vector v relative to a basis means finding a linear combination of the basis vectors that
sums to the vector v. In this case we are to write (the vector) v = (6, 3, -4) as a linear combination of (the basis
vectors) (2, -1, 9), (6, 3, -4), and (-8, 1, 1). This can be done by inspection:

$$v = (6, 3, -4) = 0(2, -1, 9) + 1(6, 3, -4) + 0(-8, 1, 1)$$

so the coordinates of v with respect to $\mathcal{B}$ are 0, 1, and 0, in that order, and as a coordinate vector $v = \begin{bmatrix}0\\1\\0\end{bmatrix}_{\mathcal{B}}$.


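8: Finding coordinates for a vector v relative to a basis means finding a linear combination of the basis vectors that
sums to the vector v.

(a) We are to write (the vector) $v = \begin{bmatrix}7\\-12\end{bmatrix}$ as a linear combination of (the basis vectors) $\begin{bmatrix}1\\-2\end{bmatrix}$ and $\begin{bmatrix}5\\-9\end{bmatrix}$.
In other words, we need to solve the equation

$$\begin{bmatrix}7\\-12\end{bmatrix} = v_1\begin{bmatrix}1\\-2\end{bmatrix} + v_2\begin{bmatrix}5\\-9\end{bmatrix}$$

which is equivalent to solving

$$\begin{bmatrix}1&5\\-2&-9\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}7\\-12\end{bmatrix}.$$

The solution can be found by row reduction:

$$\left[\begin{array}{cc|c}1&5&7\\-2&-9&-12\end{array}\right]
\xrightarrow{2M_{1,:}+M_{2,:}\to M_{2,:}}
\left[\begin{array}{cc|c}1&5&7\\0&1&2\end{array}\right]
\xrightarrow{-5M_{2,:}+M_{1,:}\to M_{1,:}}
\left[\begin{array}{cc|c}1&0&-3\\0&1&2\end{array}\right]$$

Therefore $\begin{bmatrix}7\\-12\end{bmatrix} = -3\begin{bmatrix}1\\-2\end{bmatrix} + 2\begin{bmatrix}5\\-9\end{bmatrix}$ and as a coordinate vector $v = \begin{bmatrix}-3\\2\end{bmatrix}_{\mathcal{B}}$.

(b) We are to write (the vector) $v = \begin{bmatrix}-2\\3\end{bmatrix}$ as a linear combination of (the basis vectors) $\begin{bmatrix}1\\-2\end{bmatrix}$ and $\begin{bmatrix}5\\-9\end{bmatrix}$.
In other words, we need to solve the equation

$$\begin{bmatrix}-2\\3\end{bmatrix} = v_1\begin{bmatrix}1\\-2\end{bmatrix} + v_2\begin{bmatrix}5\\-9\end{bmatrix}$$

which is equivalent to solving

$$\begin{bmatrix}1&5\\-2&-9\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}-2\\3\end{bmatrix}.$$

The solution can be found by row reduction:

$$\left[\begin{array}{cc|c}1&5&-2\\-2&-9&3\end{array}\right]
\xrightarrow{2M_{1,:}+M_{2,:}\to M_{2,:}}
\left[\begin{array}{cc|c}1&5&-2\\0&1&-1\end{array}\right]
\xrightarrow{-5M_{2,:}+M_{1,:}\to M_{1,:}}
\left[\begin{array}{cc|c}1&0&3\\0&1&-1\end{array}\right]$$

Therefore $\begin{bmatrix}-2\\3\end{bmatrix} = 3\begin{bmatrix}1\\-2\end{bmatrix} - 1\begin{bmatrix}5\\-9\end{bmatrix}$ and as a coordinate vector $v = \begin{bmatrix}3\\-1\end{bmatrix}_{\mathcal{B}}$.

Note: There is no prescribed method for finding coordinate vectors. Any method of finding the desired linear
combination will do, whether by inspection, row reduction, inverse matrices, or other tactic. This question
could have been answered just as easily using matrix inversion, for example.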


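13: The columns of the change-of-basis matrix $[\mathcal{B}]_{\mathcal{E}}$ (the conversion matrix from coordinates with respect to $\mathcal{B}$ to
coordinates with respect to $\mathcal{E}$) are the vectors of $\mathcal{B}$ written with respect to $\mathcal{E}$. These are given in the statement
of the question:

$$[\mathcal{B}]_{\mathcal{E}} = \begin{bmatrix}1&5\\-2&-9\end{bmatrix}, \qquad [\mathcal{B}]_{\mathcal{E}}^{-1} = \begin{bmatrix}-9&-5\\2&1\end{bmatrix}.$$

(a) $[\mathcal{B}]_{\mathcal{E}}^{-1}v = \begin{bmatrix}-9&-5\\2&1\end{bmatrix}\begin{bmatrix}7\\-12\end{bmatrix} = \begin{bmatrix}-3\\2\end{bmatrix}$ which has the same coordinates as the answer for 8a.

(b) $[\mathcal{B}]_{\mathcal{E}}^{-1}v = \begin{bmatrix}-9&-5\\2&1\end{bmatrix}\begin{bmatrix}-2\\3\end{bmatrix} = \begin{bmatrix}3\\-1\end{bmatrix}$ which has the same coordinates as the answer for 8b.

Note: This process is equivalent to solving the equation $[\mathcal{B}]_{\mathcal{E}}x = v$ by matrix inversion, which gives $x = [\mathcal{B}]_{\mathcal{E}}^{-1}v$,
emphasizing that $[\mathcal{B}]_{\mathcal{E}}^{-1} = [\mathcal{E}]_{\mathcal{B}}$ (the conversion matrix from coordinates with respect to $\mathcal{E}$ to coordinates
with respect to $\mathcal{B}$).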
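
The row reductions in questions 8 and 13 can be mimicked in a few lines of code. The following is only a sketch, assuming the basis vectors (1, -2) and (5, -9) read off from those solutions; the helper name `solve` is ours and is not part of the book's SageMath code. It uses exact fractions so the answers match hand computation.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (exact arithmetic).

    Raises StopIteration if A is singular (no pivot can be found)."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Columns of this matrix are the basis vectors (1, -2) and (5, -9).
basis_matrix = [[1, 5], [-2, -9]]
print(solve(basis_matrix, [7, -12]))   # coordinates of (7, -12)
print(solve(basis_matrix, [-2, 3]))    # coordinates of (-2, 3)
```

Solving the system directly and multiplying by the inverse matrix are two routes to the same coordinate vector, which is the point of the note in question 13.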
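19: [solution 1] The change-of-basis matrix $[\mathcal{B}]_{\mathcal{C}}$ (the conversion matrix from coordinates with respect to $\mathcal{B}$ to
coordinates with respect to $\mathcal{C}$) is the matrix whose columns are the vectors of $\mathcal{B}$ written with respect to $\mathcal{C}$. The
task then is to find linear combinations of the vectors of $\mathcal{C}$ that sum to the vectors of $\mathcal{B}$:

$$\begin{bmatrix}-8\\7\end{bmatrix} = v_1\begin{bmatrix}3\\7\end{bmatrix} + v_2\begin{bmatrix}2\\4\end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix}5\\-6\end{bmatrix} = v_1\begin{bmatrix}3\\7\end{bmatrix} + v_2\begin{bmatrix}2\\4\end{bmatrix}.$$

These equations can be solved by row reduction, matrix inversion, or maybe even inspection. In any case, the
solutions are $\begin{bmatrix}23\\-\frac{77}{2}\end{bmatrix}$ and $\begin{bmatrix}-16\\\frac{53}{2}\end{bmatrix}$ respectively. Therefore

$$[\mathcal{B}]_{\mathcal{C}} = \begin{bmatrix}23&-16\\-\frac{77}{2}&\frac{53}{2}\end{bmatrix}.$$

25: (a) Finding $[v]_{\mathcal{B}}$ means solving

$$\begin{bmatrix}2\\-4\end{bmatrix} = v_1\begin{bmatrix}-8\\7\end{bmatrix} + v_2\begin{bmatrix}5\\-6\end{bmatrix}.$$

By row reduction:

$$\left[\begin{array}{cc|c}-8&5&2\\7&-6&-4\end{array}\right]
\to \left[\begin{array}{cc|c}-1&-1&-2\\7&-6&-4\end{array}\right]
\to \left[\begin{array}{cc|c}-1&-1&-2\\0&-13&-18\end{array}\right]
\to \left[\begin{array}{cc|c}13&13&26\\0&-13&-18\end{array}\right]
\to \left[\begin{array}{cc|c}13&0&8\\0&-13&-18\end{array}\right]$$

and therefore $v = \begin{bmatrix}\frac{8}{13}\\\frac{18}{13}\end{bmatrix}_{\mathcal{B}}$.

(b) Finding $[v]_{\mathcal{C}}$ means solving

$$\begin{bmatrix}2\\-4\end{bmatrix} = v_1\begin{bmatrix}3\\7\end{bmatrix} + v_2\begin{bmatrix}2\\4\end{bmatrix}.$$

By row reduction:

$$\left[\begin{array}{cc|c}3&2&2\\7&4&-4\end{array}\right]
\to \left[\begin{array}{cc|c}1&0&-8\\3&2&2\end{array}\right]
\to \left[\begin{array}{cc|c}1&0&-8\\0&2&26\end{array}\right]
\to \left[\begin{array}{cc|c}1&0&-8\\0&1&13\end{array}\right]$$

and therefore $v = \begin{bmatrix}-8\\13\end{bmatrix}_{\mathcal{C}}$.

(c) From question 19, $[\mathcal{B}]_{\mathcal{C}} = \begin{bmatrix}23&-16\\-\frac{77}{2}&\frac{53}{2}\end{bmatrix}$, so

$$[\mathcal{B}]_{\mathcal{C}}[v]_{\mathcal{B}} = \begin{bmatrix}23&-16\\-\frac{77}{2}&\frac{53}{2}\end{bmatrix}\begin{bmatrix}\frac{8}{13}\\\frac{18}{13}\end{bmatrix} = \frac{1}{13}\begin{bmatrix}-104\\169\end{bmatrix} = \begin{bmatrix}-8\\13\end{bmatrix}$$

which equals (has the same coordinates as) the answer in part (b).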
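
The relationship between questions 19 and 25 can be checked numerically. The sketch below assumes the bases are B = {(-8, 7), (5, -6)} and C = {(3, 7), (2, 4)} and that v = (2, -4), as we read them from these solutions; the helper names `inv2`, `matmul`, and `matvec` are illustrative only.

```python
from fractions import Fraction as F

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists (assumes nonzero determinant)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(X, Y):
    """Matrix product of nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def matvec(X, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in X]

# Columns hold the basis vectors written in the standard basis (assumed values).
B_E = [[F(-8), F(5)], [F(7), F(-6)]]
C_E = [[F(3), F(2)], [F(7), F(4)]]
v = [F(2), F(-4)]

B_C = matmul(inv2(C_E), B_E)   # conversion matrix from B-coordinates to C-coordinates
v_B = matvec(inv2(B_E), v)     # coordinates of v relative to B
v_C = matvec(inv2(C_E), v)     # coordinates of v relative to C

print(B_C)
print(v_B, v_C)
print(matvec(B_C, v_B) == v_C)  # converting B-coordinates should reproduce C-coordinates
```

Computing the conversion matrix as the inverse of one basis matrix times the other is the same two-step idea used in question 33.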


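33: $[\mathcal{B}]_{\mathcal{C}}$ is the conversion matrix from coordinates relative to $\mathcal{B}$ to coordinates relative to $\mathcal{C}$. As a two-step process,
converting from coordinates in $\mathcal{B}$ to coordinates in $\mathcal{C}$ can be done by converting first from coordinates in $\mathcal{B}$ to
coordinates in $\mathcal{E}$ and then to coordinates in $\mathcal{C}$. As a matrix equation, $[\mathcal{B}]_{\mathcal{C}} = [\mathcal{E}]_{\mathcal{C}}[\mathcal{B}]_{\mathcal{E}}$ or (because $[\mathcal{E}]_{\mathcal{C}} = [\mathcal{C}]_{\mathcal{E}}^{-1}$)
$[\mathcal{B}]_{\mathcal{C}} = [\mathcal{C}]_{\mathcal{E}}^{-1}[\mathcal{B}]_{\mathcal{E}}$. Solving for $[\mathcal{C}]_{\mathcal{E}}$, we find $[\mathcal{C}]_{\mathcal{E}} = [\mathcal{B}]_{\mathcal{E}}[\mathcal{B}]_{\mathcal{C}}^{-1}$. Hence the basis $\mathcal{C}$ written with respect to the
standard basis appears in the columns of $[\mathcal{B}]_{\mathcal{E}}[\mathcal{B}]_{\mathcal{C}}^{-1}$: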


5 
9 


2 


[Clg = =5 


| 


and finally C = { — 26 


57 


2 


t,-3 


1 
—7 


t\. 


8 
1 


| 


1 
34: Finding coordinates for a vector v relative to a basis means finding a linear combination of the basis vectors that


sums to the vector v. In this case, the most immediate means to a solution is graphically. 


Finding [v]g: the violet grid marks coordinates with respect to 8. By inspection: 


v= 11b ~2b:s0v=| - | F 
=9 F 


Finding [v]c¢: the orange grid marks coordinates with respect to C. By inspection: 


v=5e1~5esov =| 2 |. 
—5 c 


Section 5.3


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2 2 
1e: ||v|| = (3) + (3) = S # 1. Since the norm of v is not one, v is not a unit vector. Scaling by the reciprocal 
of its norm will normalize it: 
19 Ee ces 4 
17 19 om | 17 17 


15 


so [ Tei x | is the normalized vector (it is a scaled version of v with norm 1). 

2b: $\begin{bmatrix}7\\7\end{bmatrix}\cdot\begin{bmatrix}-8\\4\end{bmatrix} = 7(-8)+7(4) = -28 \neq 0$ so the vectors are not orthogonal and therefore the set is not orthogonal.
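
2g: $\begin{bmatrix}2\\3\\4\end{bmatrix}\cdot\begin{bmatrix}-\frac12\\-5\\4\end{bmatrix} = -1-15+16 = 0$ so $\begin{bmatrix}2\\3\\4\end{bmatrix}$ and $\begin{bmatrix}-\frac12\\-5\\4\end{bmatrix}$ are orthogonal. However, it is not enough that one
pair of vectors is orthogonal. All pairs must be orthogonal for the set to be orthogonal. Checking the other two
pairs, $\begin{bmatrix}2\\3\\4\end{bmatrix}\cdot\begin{bmatrix}16\\-5\\-\frac{17}{4}\end{bmatrix} = 32-15-17 = 0$ and $\begin{bmatrix}-\frac12\\-5\\4\end{bmatrix}\cdot\begin{bmatrix}16\\-5\\-\frac{17}{4}\end{bmatrix} = -8+25-17 = 0$. All three possible pairs
of vectors are orthogonal so the set is orthogonal.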


7 


3 


ao 


: (| i ; Pi = 7(-8) + 2(7)(4) = 0 so the vectors are orthogonal and therefore the set is orthogonal (with 


respect to this inner product). 


5b: $\langle t^2 - 2t,\ 3t^2 - t - 2\rangle = 0(-2) + (-1)(0) + 0(8) = 0$ so $3t^2 - t - 2$ is orthogonal to $t^2 - 2t$.


4 | | 2 
6a: projyv = Ae = : S eee “3 
ee cao” [STs Tiel Ble) = 
a) | 3 
11d: Coordinates of a vector v relative to an orthogonal basis can be computed by projecting v onto each vector of 
the basis: 
18 3 18 25 
5 |-| 8 5 +] -15 . > 
pe al ery 
3 3 5 98 5 25 25 9 931 9 
8 }-] 8 -15 -15 
3 =) 9 9 
and 


89 
98 

ov| aoe | . 
9 

8B 


12a: (solution 1) Following the orthogonalization procedure, the first vector of the orthogonal set is the first vector 


‘ 1 . —— ee 
of the given set, | it The second vector of the orthogonal set is the second vector minus its projection onto 
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the first vector, 


(SH. 
ska 


Therefore the result of orthogonalization is | a | : = }. 


-1 
(solution 2) The question did not request a specific orthogonal set. By observation, the span of the given
set is $\mathbb{R}^2$ so any orthogonal set spanning $\mathbb{R}^2$ will suffice. For example, the standard basis $\left\{\begin{bmatrix}1\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix}\right\}$
or any other set containing a pair of vectors whose dot product is zero.


; . 3 : 
13a: The first vector of the orthogonal set is the first vector of the given set, | 4 However, an orthonormal set is 


requested, so this vector needs to be scaled to a unit vector: 


ee 


UL RUILS 
—_ 


3 _ 
1 |= 
The second vector of the orthogonal set is the second vector minus its projection onto the first, normalized, 


sete ol a8 |-{ ED E]=[ 3-5] 


Notice the calculation of the projection is slightly simplified since the denominator must be 1 as we are
projecting onto a unit vector. To finish, the new vector needs to be scaled to a unit vector:


Salil seeal 1171 


-8 
6 


MIB 
BUT 


1 
V64 + 36 


3 
4 


Therefore the result of orthonormalization is { 


16b: According to the solution of question 2b, the vectors are not orthogonal. Orthogonalization of a single pair of 


vectors amounts to subtracting from one of the vectors its projection onto the other. Letting v = and 


wala | 
vemive| SAE Uf e213] [4 


7 
7 


Hence an orthogonal set with the same span as S is {| 


~~ 
es 
—— 
| 
No 
ae 
—— 


Normalizing each vector, 


=I 
~~ 
aS 
Il 
~ 
2|- 
N 
7 
~~ 
a6 
ll 
a 
<l-<1- 
Sis 
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and 


v2 


17b: According to the solution of question 7.5b, the vectors are orthogonal so the set S itself is an orthogonal set 
with the same span! Normalizing each vector, 


aa + 
and an orthonormal set with the same span is S$ is 1 v2 | ‘ | v2 I} 


and 


‘HIE 
Fr 


ae ele 
and an orthonormal set with the same span is S is 1 \3 | ; | 3 I 
5 al 


3 


Section 5.4 


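1a: Solutions may vary. Any matrix P with n linearly independent eigenvectors of M will diagonalize an n × n matrix
M. In this case, the simplest answer is to list the eigenvectors in the order given as the columns of P. That is,
$P = \begin{bmatrix}2&1\\5&-2\end{bmatrix}$. Now

$$P^{-1}MP = \begin{bmatrix}2&1\\5&-2\end{bmatrix}^{-1}\begin{bmatrix}5&4\\20&7\end{bmatrix}\begin{bmatrix}2&1\\5&-2\end{bmatrix}
= -\frac19\begin{bmatrix}-2&-1\\-5&2\end{bmatrix}\begin{bmatrix}5&4\\20&7\end{bmatrix}\begin{bmatrix}2&1\\5&-2\end{bmatrix}$$

$$= -\frac19\begin{bmatrix}-30&-15\\15&-6\end{bmatrix}\begin{bmatrix}2&1\\5&-2\end{bmatrix}
= -\frac19\begin{bmatrix}-135&0\\0&27\end{bmatrix} = \begin{bmatrix}15&0\\0&-3\end{bmatrix}.$$


1g: Solutions may vary. Any matrix P with n linearly independent eigenvectors of M will diagonalize an n × n matrix
M. In this case, the simplest answer is to list the eigenvectors in the order given as the columns of P. That is,
$P = \begin{bmatrix}2&5&-5\\2&2&1\\1&0&2\end{bmatrix}$. Now

$$P^{-1}MP = \begin{bmatrix}2&5&-5\\2&2&1\\1&0&2\end{bmatrix}^{-1}\begin{bmatrix}11&-20&30\\8&-17&30\\4&-10&18\end{bmatrix}\begin{bmatrix}2&5&-5\\2&2&1\\1&0&2\end{bmatrix}$$

$$= \frac13\begin{bmatrix}4&-10&15\\-3&9&-12\\-2&5&-6\end{bmatrix}\begin{bmatrix}11&-20&30\\8&-17&30\\4&-10&18\end{bmatrix}\begin{bmatrix}2&5&-5\\2&2&1\\1&0&2\end{bmatrix}$$

$$= \frac13\begin{bmatrix}4&-10&15\\-3&9&-12\\-2&5&-6\end{bmatrix}\begin{bmatrix}12&15&-15\\12&6&3\\6&0&6\end{bmatrix}
= \frac13\begin{bmatrix}18&0&0\\0&9&0\\0&0&9\end{bmatrix} = \begin{bmatrix}6&0&0\\0&3&0\\0&0&3\end{bmatrix}.$$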
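
A quick way to double-check a diagonalization without computing an inverse is to verify MP = PD, which says each column of P is an eigenvector with the matching diagonal entry of D as its eigenvalue. The sketch below assumes the matrices as reconstructed in 1a and 1g; `matmul` is our own helper name.

```python
def matmul(X, Y):
    """Matrix product of nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

# Question 1a (assumed entries): eigenvalues 15 and -3.
M_a = [[5, 4], [20, 7]]
P_a = [[2, 1], [5, -2]]
D_a = [[15, 0], [0, -3]]

# Question 1g (assumed entries): eigenvalues 6, 3, 3.
M_g = [[11, -20, 30], [8, -17, 30], [4, -10, 18]]
P_g = [[2, 5, -5], [2, 2, 1], [1, 0, 2]]
D_g = [[6, 0, 0], [0, 3, 0], [0, 0, 3]]

# M P = P D is equivalent to P^{-1} M P = D whenever P is invertible.
print(matmul(M_a, P_a) == matmul(P_a, D_a))
print(matmul(M_g, P_g) == matmul(P_g, D_g))
```

The check uses only integer arithmetic, so there is no roundoff to worry about; invertibility of P still has to be confirmed separately (for instance, by a nonzero determinant).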
2a: The entries on the diagonal of the (diagonalized) matrix $P^{-1}MP$ are the eigenvalues of M. In this case, -3 and
15.


2g: The entries on the diagonal of the (diagonalized) matrix $P^{-1}MP$ are the eigenvalues of M. In this case, 3 and 6.


3d: Yes. An n × n matrix is diagonalizable if and only if it has a linearly independent set of n eigenvectors. In this
question we have a 2 × 2 matrix so we need 2 linearly independent eigenvectors. Eigenvectors corresponding to
different eigenvalues are necessarily linearly independent. Can you see why? Hence, without calculating them,
we know the matrix has two linearly independent eigenvectors (corresponding to the two eigenvalues).


3e: An n × n matrix is diagonalizable if and only if it has a linearly independent set of n eigenvectors. In this question
we have a 3 × 3 matrix so we need 3 linearly independent eigenvectors. Eigenvectors corresponding to different
eigenvalues are necessarily linearly independent. Can you see why? Hence, without calculating them, we
know the matrix has two linearly independent eigenvectors (corresponding to the two eigenvalues). We must
determine whether there is a third. Reducing $M - 2I$:

$$\begin{bmatrix}16&-8&14\\-8&4&-7\\-32&16&-28\end{bmatrix} \to \begin{bmatrix}-8&4&-7\\16&-8&14\\-32&16&-28\end{bmatrix} \to \begin{bmatrix}-8&4&-7\\0&0&0\\0&0&0\end{bmatrix}$$

we see that the equation $(M - 2I)v = 0$ has two free variables (so the dimension of the eigenspace corresponding
with 2 is two) and there are two linearly independent eigenvectors corresponding with the eigenvalue 2.
Combined with an eigenvector corresponding with eigenvalue -6, we have a total of three linearly independent
eigenvectors and M is diagonalizable. Yes.


4b: Similar matrices have the same determinant, so we can solve for k via

$$\begin{vmatrix}5&-2\\-4&3\end{vmatrix} = \begin{vmatrix}-3&-2\\20&k\end{vmatrix}$$

which simplifies to 15 - 8 = -3k + 40, the solution of which is k = 11.


5b: Answers will vary. We are to determine matrix $P = \begin{bmatrix}a&b\\c&d\end{bmatrix}$ such that PA = BP. We found that k = 11
previously, so we need to solve the equation

$$\begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}5&-2\\-4&3\end{bmatrix} = \begin{bmatrix}-3&-2\\20&11\end{bmatrix}\begin{bmatrix}a&b\\c&d\end{bmatrix}$$

or

$$\begin{bmatrix}5a-4b&-2a+3b\\5c-4d&-2c+3d\end{bmatrix} = \begin{bmatrix}-3a-2c&-3b-2d\\20a+11c&20b+11d\end{bmatrix}.$$

Matching entries gives a homogeneous linear system of four equations in the four unknowns:

5a - 4b = -3a - 2c
-2a + 3b = -3b - 2d
5c - 4d = 20a + 11c
-2c + 3d = 20b + 11d

solvable by row reduction:

$$\begin{bmatrix}8&-4&2&0\\-2&6&0&2\\-20&0&-6&-4\\0&-20&-2&-8\end{bmatrix} \to \begin{bmatrix}1&0&\frac{3}{10}&\frac15\\0&1&\frac{1}{10}&\frac25\\0&0&0&0\\0&0&0&0\end{bmatrix}$$

c and d are free variables, so there are many solutions. One solution is c = 0, d = 5, which makes a = -1 and
b = -2. Hence $P = \begin{bmatrix}-1&-2\\0&5\end{bmatrix}$ is one solution. For this choice of P, $PA = BP = \begin{bmatrix}3&-4\\-20&15\end{bmatrix}$.
8a: Since similar matrices have the same determinant, eigenvalues, characteristic polynomial, and rank, we may
explain their non-similarity by showing they differ on any one of these characteristics. For this particular pair, 
for example, they do not have the same rank. Matrix A has linearly independent columns (rank 2) while B has 
linearly dependent columns (rank 1). This is enough to show they are not similar. However, if we happened 


not to notice the differing ranks, we could also note that their determinants are not equal: |A| = | . : = 
‘ 5 -10 . or 
12-11 = 1 while |B) =| | A: = 40 - 40 = 0. Of course we could also note that their characteristic 


polynomials or eigenvalues differ, but these are likely more tedious to show. 


10a: Add the last two lines of the code in question 9. The result is 


[ 
[a — 9*rl1, — -1/9*r1 + 1/9*r2, c == r2, d == r1] 
] 
1 1 
so the solution is P = a gM + 92 
2 al 


n . . . = 5 " 
13a: Computing an eigenvector corresponding to A = 2: 


-2 -4 = —22 -ll ah -22 -l1 
aH 2 22 11 0 0 


so one eigenvector is | _» | Computing an eigenvector corresponding to 2 = —1: 
-H -u -l1 -ll | 1 1 | 
6 6 
u | =i 
= = | 11 11 0 0 
1 1 1 3 0 lay? (3)’ 0 
i i i = = = ea = 6 
so one eigenvector is “4 Letting P 9 -] |? MP 0 1 |s0 P M'P oy 
and therefore 
7 7 4 _(5\' _1_ (5)' 
wep (Qo pi-[ ‘19 0 [3 le 2-(3), =I ae 
0 -l 0 -l 2+2(2) 1+2(3) 
Section 6.1 


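1e: Answers will vary (depending on the steps taken to reduce M). Row reducing M to an echelon form without
using row swaps produces U:

$$\begin{bmatrix}4&24&24\\1&-30&0\end{bmatrix}
\xrightarrow{\frac14 M_{1,:}\to M_{1,:}}
\begin{bmatrix}1&6&6\\1&-30&0\end{bmatrix}
\xrightarrow{-M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix}1&6&6\\0&-36&-6\end{bmatrix}$$

We have arrived at an echelon form, so U is determined. Applying the inverse elementary operations to the
identity matrix to find L:

$$\begin{bmatrix}1&0\\0&1\end{bmatrix}
\xrightarrow{M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix}1&0\\1&1\end{bmatrix}
\xrightarrow{4M_{1,:}\to M_{1,:}}
\begin{bmatrix}4&0\\1&1\end{bmatrix}$$

We hence have the decomposition $M = \begin{bmatrix}4&0\\1&1\end{bmatrix}\begin{bmatrix}1&6&6\\0&-36&-6\end{bmatrix}$ (which can be double-checked by performing
the multiplication).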
2c: Answers will vary (depending on the steps taken to reduce M). Row reducing M to an echelon form using at
least one row swap produces U: 


-7 3 Jin om $77 3 lap com | 3 5S Laan coo, | 3 75 
6 -10}*> —> | 3 -S | =5*|-7 3 | =5 "| -21 9 
8-5 ge <5 ee eee Ge ae 


0 —26 = 0 -26 
0 25 0 0 


—= 
—-8M,.+M3.>M3: 


7M),,+M>,—>Mp, | aa et . | 


We have arrived at an echelon form, so U is determined. Applying the inverse elementary operations to the 
identity matrix to find PL: 


10 0 — 2 M2;+M3, M3, | 0 0 -7M,,.+M):.>M); 1 0 0 3M>,>M),; o p 
0 1 0 — 0 1 O Se ae <b. oe — -3; gz  O 
00 1 0 a 1 8M,,+M3,>M3,: 8 = p | 3M oMs 8 = 1 
1 1 7 1 

meeom:| 2 3 © lowmowm.| 3 3 © 

— 1 0 O 2 0 O 

8 _25 1 8 _25 1 

3 78 3 3 78 3 


There is only one permutation in the row reduction, M,, << M>,., so P"! = P = 


0 1 0 
1 O O |. We hence 
00 1 


0 1 0 2 0 O 3° =5 
have the decomposition M =| 1 0 O i i 0 0 -—26 | (which can be double-checked by 
00 1], § -8 £],0 oO 


performing the multiplication). 


3e: M is not square and therefore does not have a determinant. If M were square, the LU decomposition would give
the determinant very simply: det(LU) = det L · det U. Since L and U are triangular, their determinants are
simply the products of the entries on the main diagonals.


4c: M is not square and therefore does not have a determinant. If M were square, the PLU decomposition would give
the determinant very simply: det(PLU) = det P · det L · det U. Since L and U are triangular, their determinants
are simply the products of the entries on the main diagonals. Since P is a permutation matrix, its determinant
is either 1 or -1, depending on whether it incorporates an even number or an odd number of swaps.


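5e: Row reducing M to an echelon form without using row swaps or row scaling produces U:

$$\begin{bmatrix}4&24&24\\1&-30&0\end{bmatrix}
\xrightarrow{-\frac14 M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix}4&24&24\\0&-36&-6\end{bmatrix}$$

We have arrived at an echelon form, so U is determined. Applying the inverse elementary operations to the
identity matrix to find L:

$$\begin{bmatrix}1&0\\0&1\end{bmatrix}
\xrightarrow{\frac14 M_{1,:}+M_{2,:}\to M_{2,:}}
\begin{bmatrix}1&0\\\frac14&1\end{bmatrix}$$

We hence have the decomposition $M = \begin{bmatrix}1&0\\\frac14&1\end{bmatrix}\begin{bmatrix}4&24&24\\0&-36&-6\end{bmatrix}$ (which can be double-checked by performing
the multiplication).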


7d: Since LU = M, we are solving the system

$$\begin{bmatrix}1&0&0\\-4&1&0\\-2&2&1\end{bmatrix}\begin{bmatrix}3&-7&-2\\0&7&-1\\0&0&2\end{bmatrix}v = \begin{bmatrix}0\\6\\14\end{bmatrix}$$

in two steps. First by solving $\begin{bmatrix}1&0&0\\-4&1&0\\-2&2&1\end{bmatrix}w = \begin{bmatrix}0\\6\\14\end{bmatrix}$ and then solving $\begin{bmatrix}3&-7&-2\\0&7&-1\\0&0&2\end{bmatrix}v = w$. The vector
w can be determined quickly by writing down the system represented by $\begin{bmatrix}1&0&0\\-4&1&0\\-2&2&1\end{bmatrix}\begin{bmatrix}w_1\\w_2\\w_3\end{bmatrix} = \begin{bmatrix}0\\6\\14\end{bmatrix}$ and
using substitution:

w1 = 0
-4w1 + w2 = 6
-2w1 + 2w2 + w3 = 14

Substituting w1 = 0 into -4w1 + w2 = 6 gives w2 = 6 and substituting w1 = 0 and w2 = 6 into -2w1 +
2w2 + w3 = 14 gives 12 + w3 = 14 or w3 = 2. Hence $w = \begin{bmatrix}0&6&2\end{bmatrix}^T$ and we now need to solve
$\begin{bmatrix}3&-7&-2\\0&7&-1\\0&0&2\end{bmatrix}\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = \begin{bmatrix}0\\6\\2\end{bmatrix}$, which can again be solved by substitution:

3v1 - 7v2 - 2v3 = 0
7v2 - v3 = 6
2v3 = 2

Starting with 2v3 = 2 we have v3 = 1. Substituting v3 = 1 into 7v2 - v3 = 6 yields 7v2 = 7 or v2 = 1.
Substituting v3 = 1 and v2 = 1 into 3v1 - 7v2 - 2v3 = 0 yields 3v1 = 9 or v1 = 3. Hence the solution of
$\begin{bmatrix}3&-7&-2\\-12&35&7\\-6&28&4\end{bmatrix}v = \begin{bmatrix}0\\6\\14\end{bmatrix}$ is $v = \begin{bmatrix}3\\1\\1\end{bmatrix}$.
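
The two substitution passes above translate directly into code. This is a minimal sketch, assuming L has a nonzero diagonal and U has nonzero pivots; the function names `forward_sub` and `back_sub` are ours and do not come from the book's SageMath cells.

```python
from fractions import Fraction

def forward_sub(L, b):
    """Solve L w = b for lower triangular L, working from the top row down."""
    w = []
    for i, row in enumerate(L):
        known = sum(row[j] * w[j] for j in range(i))   # contribution of entries already found
        w.append(Fraction(b[i] - known) / row[i])
    return w

def back_sub(U, w):
    """Solve U v = w for upper triangular U, working from the bottom row up."""
    n = len(U)
    v = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * v[j] for j in range(i + 1, n))
        v[i] = Fraction(w[i] - known) / U[i][i]
    return v

L = [[1, 0, 0], [-4, 1, 0], [-2, 2, 1]]
U = [[3, -7, -2], [0, 7, -1], [0, 0, 2]]
b = [0, 6, 14]

w = forward_sub(L, b)   # intermediate vector from L w = b
v = back_sub(U, w)      # final solution from U v = w
print(w, v)
```

Once L and U are known, each new right-hand side b costs only these two cheap passes, which is the practical appeal of the decomposition.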


Section 6.2


1b: The eigenvalues of M are determined by its characteristic equation:

$$|M - \lambda I| = \begin{vmatrix}-4-\lambda&3\\-2&-9-\lambda\end{vmatrix} = (-4-\lambda)(-9-\lambda) + 6 = \lambda^2 + 13\lambda + 42 = 0$$

which can be factored as $(\lambda + 7)(\lambda + 6) = 0$ giving eigenvalues -7 and -6. The magnitudes (absolute values)
of the eigenvalues are 7 and 6, so -7 is the dominant eigenvalue (the one with greatest magnitude).


2b: The eigenvalues of M are determined by its characteristic equation:

$$|M - \lambda I| = \begin{vmatrix}19-\lambda&12\\-28&-19-\lambda\end{vmatrix} = (19-\lambda)(-19-\lambda) + 336 = \lambda^2 - 25 = 0$$

which can be factored as $(\lambda + 5)(\lambda - 5) = 0$ giving eigenvalues -5 and 5. The magnitudes (absolute values) of the
eigenvalues are both 5, so there is no dominant eigenvalue (one with greatest magnitude). The two eigenvalues
have the same magnitude.


3a: The most immediate approximation of an eigenvector is either $v_5$ or $v_6$ itself. The simplest way to get a handle
on the associated eigenvalue is to note that $v_6 = Mv_5$, which should be approximately $\lambda v_5$:

$$v_6 = \begin{bmatrix}1894\\-3020\end{bmatrix} \approx \lambda\begin{bmatrix}-467\\742\end{bmatrix} = \begin{bmatrix}-467\lambda\\742\lambda\end{bmatrix}$$

Matching entries, we see that $1894 \approx -467\lambda$ and $-3020 \approx 742\lambda$, and solving for $\lambda$, $\lambda \approx \frac{1894}{-467} \approx -4.056$ and
$\lambda \approx \frac{-3020}{742} \approx -4.070$. Either one of these estimates will do for our approximation. There is no reason to choose
one over the other save personal preference. As for an approximate eigenvector, we expect $v_6$ to be a better
approximation than $v_5$. In summary, we choose

$$\left(-4.065,\ \begin{bmatrix}1894\\-3020\end{bmatrix}\right)$$

as the approximate eigenpair.


4a: The code computes $v_1$ through $v_{11}$ for us. Since the ratios $(v_{10})_{1,1} : (v_{10})_{2,1}$ and $(v_{11})_{1,1} : (v_{11})_{2,1}$ are 5196345 :
12990681 and 171478659 : 428694651, or approximately 1 : 2.49997 and 1 : 2.49999, it seems that the
eigenvector is converging to some multiple of $\begin{bmatrix}1\\2.5\end{bmatrix}$ and therefore yes, it seems as if the method will converge.
: 0.625 . : 
5a: The code computes v6 through Vjo9 for us. Since Vj99 = “4 and Ws, through Vog are multiples of this vector 


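(1 or -1 times), the method has settled/converged on this direction as an eigenvector. To find the associated
eigenvalue, we compute

$$Mv_{100} = \begin{bmatrix}-12&-5\\16&6\end{bmatrix}\begin{bmatrix}0.625\\-1\end{bmatrix} = \begin{bmatrix}-2.5\\4\end{bmatrix}.$$

Noting that $\begin{bmatrix}-2.5\\4\end{bmatrix} = -4\begin{bmatrix}0.625\\-1\end{bmatrix}$, the associated eigenvalue is -4 and we have the exact eigenpair

$$\left(-4,\ \begin{bmatrix}0.625\\-1\end{bmatrix}\right).$$

Note: This compares well to the approximation from question 3a where the result was

$$\left(-4.065,\ \begin{bmatrix}1894\\-3020\end{bmatrix}\right).$$

Scaling the approximate eigenvector by $\frac{1}{3020}$ gives (a scaled eigenvector approximation of about) $\begin{bmatrix}0.62715\\-1\end{bmatrix}$.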


Section 6.3 


la: The parallelogram in question is sketched below. 


[Figure: sketch of the parallelogram.]
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By an analysis much like that in subsection Areas and eigenvalues, an argument can be made that this paral- 
lelogram is the image of the unit square (the square with vertices at (0,0), (1,0), (1, 1), and (0, 1)) under the 


3 : 1 2 0 3 : 
3 4 v, which maps | 3 and | | 2A | Leting P 


be the parallelogram and S$ the unit square, we have T(S) = P and therefore area(P) = | det Al: area(S) = 17-1. 
The area of the parallelogram is 17. 


linear transformation T,(v) = Av = 


1d: The parallelogram in question is sketched below. 


[Figure: sketch of the parallelogram, with the vertex (16, 12) labeled.]


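By an analysis much like that in subsection Areas and eigenvalues, an argument can be made that this
parallelogram is the image of the unit square (the square with vertices at (0, 0), (1, 0), (1, 1), and (0, 1)) under
the affine transformation $T(v) = Av + b = \begin{bmatrix}4&8\\6&1\end{bmatrix}v + \begin{bmatrix}4\\5\end{bmatrix}$ which maps $\begin{bmatrix}0\\0\end{bmatrix}$ to $\begin{bmatrix}4\\5\end{bmatrix}$, $\begin{bmatrix}1\\0\end{bmatrix}$ to $\begin{bmatrix}8\\11\end{bmatrix}$, $\begin{bmatrix}0\\1\end{bmatrix}$ to $\begin{bmatrix}12\\6\end{bmatrix}$,
and $\begin{bmatrix}1\\1\end{bmatrix}$ to $\begin{bmatrix}16\\12\end{bmatrix}$. Letting P be the parallelogram and S the unit square, we have T(S) = P and therefore

area(P) = |det A| · area(S) = 44 · 1. The area of the parallelogram is 44.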
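
As a cross-check on the determinant argument, the area can also be computed directly from the vertices with the shoelace formula. The sketch assumes the matrix A and vector b as reconstructed above; `det2`, `affine`, and `shoelace_area` are hypothetical helper names.

```python
def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def affine(A, b, p):
    """Apply T(p) = A p + b to a point p in the plane."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

def shoelace_area(points):
    """Area of a polygon whose vertices are listed in order around the boundary."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

A = [[4, 8], [6, 1]]
b = [4, 5]

# Image of the unit square, vertices taken in order around the square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
parallelogram = [affine(A, b, p) for p in square]

print(parallelogram)
print(abs(det2(A)), shoelace_area(parallelogram))
```

The translation by b moves the parallelogram but does not change its area, which is why only det A appears in the formula.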


4 - is exactly twice the area of the given 


3b: The area of the parallelogram determined by the columns of 


-l 6 


triangle. By question 2, 4 


det | = 22 is the area of the parallelogram, so the area of the triangle is 11.

6a: Because M and U are similar, they have the same determinant and eigenvalues (see Section 5.4). Because U is 
triangular, (i) det U is the product of the entries on its diagonal, (—4)(5) = —20 = det M; and (ii) the eigenvalues 
of U are the entries on its diagonal: —4 and 5 are the eigenvalues of M. 


7a: Adding the lines 


print("M =")
print(M); print()
print("det M =",M.det())
print("eigenvalues are",M.eigenvalues())


to the SageMath code produces


M =
[-13/37 54/37]
[495/37 50/37]


det M = -20
eigenvalues are [5, -4]


Hence det M = -20 and the eigenvalues of M are -4 and 5 as determined before.


lla: Following the algorithm requires finding an eigenpair, so we find an eigenvector of M corresponding to eigen- 
value A = —3: 


16 —8 | -ZAi+42:>42;[ 16 -8 
m—ar=l 94 a _ EF fal 


1 1 0 
2 | veseto=| > 41 


is a 1 x | matrix, it is already upper triangular, and we can select R = [1]. Hence we have 


so eigenvectors 7 | satisfy 16v,; — 8v2 = O or v2 = 2v,. Using eigenvector v = 
2 
: 1 
Since (Q MQ), , 
oa\ i and therefore P= 00 =| 5 al 


2 1 
Note that 


which is upper triangular. 


lle: Following the algorithm requires finding an eigenpair, so we find an eigenvector of M corresponding to eigen- 


value A = 4: 
10 10 
~38 -50 -24 $43, +42: Ad,; “ao 0 - FAL: PAL: 
M-aAl=| 37 49 24 | °° —> 7. 3 
-26 -35 -18 |~34%*4u>4n | 26-35-18 | 742 7Ae 
1 1 0 i lly le 0 0 O 
1 1 0 = 1 1 O 
-26 -35 -18 |****"4=1 9 9 -18 
v1 2 2 0 0 
so eigenvectors] v2 | satisfy v2 = —vy and v3 = 4v,. Using eigenvector v = —2 |wesetQ=| -2 1 OO}. 
V3 1 1 01 
1 


is a 2 X 2 matrix, it must be calculated and triangularized. By row reduction, Q = 


Since (Q-'MQ) 


0 
i 
2 


0 
1 O | and so 
0 1 


\L1 


Let bd Le 


5 0 O]f -34 -50 -24][ 2 0 0 4 -25 -12 
QO'mo=| 1 1 0 37 53 24 —2 1 0}f=|0 3 0 
-; 0 1]| -26 -35 -14}{ 1 0 1 0 -10 -2 
4 3 0 ‘ : : . . 
and (Q MQ), al | ee must now be triangularized (applying the algorithm as done for question 


lla). The eigenvalues of (o-'M Q) are 3 and —2. Choosing to find an eigenvector for eigenvalue 3, 


\L1 
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0 0 1 1 O 
—1 _ == . . . . n 
(Q MQ), 3] = 10 -5 | so a corresponding eigenvector is _9 |: Hence 2 1 triangular: 
izes (2'mo),, ‘ and we set R = = : . Returning to the triangularization of M, 
1 0 1 0 0 2 0 0 1 0 0 
O-| 5 e | 0 1 O|} and P=QO=|] 2 1 04/0 1 O 
0 —2 1 1 oO 1 0 2 1 
2 0 0O 
Multiplying, we find P=] -2 1 O 
1 -2 1 
Note that 
2 0] '[ -34 -50 -24][ 2 0 0 
P'MP=| -2 1 O 37 «53 «24 |} -2 «1 «OO 
1 -—2 1 -26 -35 -14 1 2 1 
5 0 0]f[ -34 -50 -24][ 2 0 0 
=| 1 1 0 37 53 24 —2 1 O 
2 2 1 j| -26 -35 -14 Le 7 


-17 -25 -12 2 0 0O 
=| 3 3 0 —2 | Of= 
3 <4. 2 1 —2 1 


which is upper triangular. 


Section 6.4 


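1b: This question is requesting the best approximation of (the point) $p = \begin{bmatrix}12\\1\end{bmatrix}$ within the subspace $W = \operatorname{span}\left\{\begin{bmatrix}7\\-6\end{bmatrix}\right\}$
(the collection of all multiples of v). By theorem 18, the answer is the projection of p onto W:

$$\operatorname{proj}_W p = \operatorname{proj}_v p = \frac{p\cdot v}{v\cdot v}v = \frac{12(7) + 1(-6)}{7(7) + (-6)(-6)}v = \frac{78}{85}\begin{bmatrix}7\\-6\end{bmatrix}.$$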


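2b: The distance between a point and line is the distance between the point and the nearest point on the line (the best
approximation of the point on the line). The line $y = \frac15 x$ is shorthand for the set of points $\{(x, y) : y = \frac15 x\}$, or as
vectors,

$$\left\{\begin{bmatrix}x\\y\end{bmatrix} : y = \tfrac15 x\right\} = \left\{\begin{bmatrix}x\\\frac15 x\end{bmatrix} : x \text{ in } \mathbb{R}\right\} = \left\{x\begin{bmatrix}1\\\frac15\end{bmatrix} : x \text{ in } \mathbb{R}\right\} = \operatorname{span}\left\{\begin{bmatrix}1\\\frac15\end{bmatrix}\right\} = \operatorname{span}\left\{\begin{bmatrix}5\\1\end{bmatrix}\right\}.$$

Hence we seek the best approximation of (the point) $p = \begin{bmatrix}-3\\-4\end{bmatrix}$ within the subspace $W = \operatorname{span}\left\{\begin{bmatrix}5\\1\end{bmatrix}\right\}$ (the line
$y = \frac15 x$). By theorem 18, the nearest point to p is the projection of p onto W:

$$\operatorname{proj}_W p = \frac{-3(5) + (-4)(1)}{5(5) + 1(1)}\begin{bmatrix}5\\1\end{bmatrix} = -\frac{19}{26}\begin{bmatrix}5\\1\end{bmatrix}$$

The distance between the point and line is then the distance between the point and the best approximation, the
norm of their difference:

$$\left\|\begin{bmatrix}-3\\-4\end{bmatrix} - \left(-\frac{19}{26}\right)\begin{bmatrix}5\\1\end{bmatrix}\right\| = \frac{1}{26}\left\|\begin{bmatrix}17\\-85\end{bmatrix}\right\| = \frac{\sqrt{17^2 + 85^2}}{26} = \frac{\sqrt{7514}}{26} \approx 3.334$$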
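
Projection onto a line and the resulting distance are short computations in code. The sketch below assumes the point (-3, -4) and direction (5, 1) from question 2b, and the point (12, 1) and direction (7, -6) from question 1b, as reconstructed above; `dot` and `project` are our own helper names.

```python
from fractions import Fraction
import math

def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))

def project(p, v):
    """Projection of p onto the line spanned by the nonzero vector v."""
    c = Fraction(dot(p, v), dot(v, v))   # exact scalar coefficient (p.v)/(v.v)
    return [c * x for x in v]

# Question 2b: distance from p to the line spanned by v.
p = [-3, -4]
v = [5, 1]
nearest = project(p, v)
difference = [a - b for a, b in zip(p, nearest)]
distance = math.sqrt(float(dot(difference, difference)))
print(nearest, difference, distance)

# Question 1b: projection of (12, 1) onto the span of (7, -6).
print(project([12, 1], [7, -6]))
```

The difference vector comes out orthogonal to the direction vector, which is exactly what makes the projection the nearest point.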
3e: By definition, the projection of a vector (as the given v) onto a subspace is the sum of its projections onto an
orthogonal basis (as the given $\mathcal{B}$) of the subspace.


9 -1 9 —3 
-12 |-] -4 -12 |-| 2 
Petey Lele 
a — + SSS ers 
aay ste 
-4 |.| -4 
5 5 
-1 -3 1 9 
ae i=] -2 
42 14 1 5 
Note: This could have been determined without calculation. We have a basis with 3 elements in $\mathbb{R}^3$, so
$\operatorname{span}\mathcal{B} = \mathbb{R}^3$ and therefore v is in $\operatorname{span}\mathcal{B}$. The vector is in the subspace in question and so does not need
to be approximated. It can be represented exactly as a linear combination of any basis. See question 10.


proj spanB v= 


4b: span8 equals the span of the column space of [8], which can be determined by row reduction: 
0 6 6 3 -l -4 3 -l -4 3 -1 -4 
-9 -16 -7}>] -9 -16 -7}>]| 0 -19 -19}>/]0 1 1 4. 
-6 2 8 0 6 6 0 6 6 0 0 0 
The first two columns have pivot positions, so the first two vectors of 8 form a basis for span8. In order to 


0 6 
project onto span8, however, we need an orthogonal basis. Letting b; = -} | 3 | = and by = 5 | —16 | = 
2 2 


3 0 3 
: —22 
b2 — proj, bz = 0 |-SEf3-| 
2 is 
3 39 0 39 
13 -# =| —38 |, so we may use yw; =| 3 |,w2 =| —38 |? for an orthogonal basis. Finally, 
oe 57 2 57 


SHI,» 
Bee 


0 39 1578 6.6025 
= ae 3 | + a —38 |= aa —1703 |=} —7.1255 
2196 9.1883 


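5b: By definition v is in $W^\perp$ if it is orthogonal to all vectors in subspace W. Since

$$W = \left\{a\begin{bmatrix}-7\\0\\0\end{bmatrix} + b\begin{bmatrix}8\\-3\\-4\end{bmatrix} : a, b \text{ in } \mathbb{R}\right\},$$

we may represent a particular but arbitrary vector w of W by $w = a\begin{bmatrix}-7\\0\\0\end{bmatrix} + b\begin{bmatrix}8\\-3\\-4\end{bmatrix}$ for some real numbers
a and b. Then

$$w\cdot v = \left(a\begin{bmatrix}-7\\0\\0\end{bmatrix} + b\begin{bmatrix}8\\-3\\-4\end{bmatrix}\right)\cdot\begin{bmatrix}0\\-8\\6\end{bmatrix} = a\left(\begin{bmatrix}-7\\0\\0\end{bmatrix}\cdot\begin{bmatrix}0\\-8\\6\end{bmatrix}\right) + b\left(\begin{bmatrix}8\\-3\\-4\end{bmatrix}\cdot\begin{bmatrix}0\\-8\\6\end{bmatrix}\right)$$

$$= a(-7(0) + 0(-8) + 0(6)) + b(8(0) - 3(-8) - 4(6)) = 0.$$

So v is orthogonal to all elements of W and is therefore in $W^\perp$.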


Note: In retrospect, showing that v is orthogonal to all elements of the given basis of W (shown as part of the 
computation above) is sufficient. 


6: We have determined above that v is in $W^\perp$, so one decomposition is $v = 0 + \begin{bmatrix}0\\-8\\6\end{bmatrix}$. 0 is in W and $\begin{bmatrix}0\\-8\\6\end{bmatrix}$ is in
$W^\perp$ as required. (And by theorem 16, this is the only such decomposition.)


7a: By row reduction, the solution of the system: 


-6 3 -15 7 i i Sal -6 3 -15 7 
14 -7 35 1 — “| 14 7 35 1 


-18 9 -45 6 0 O 0 -15 


The system is inconsistent! We must find a best approximation. According to theorem 19, we need to project 
the constant vector onto the column space of the coefficient matrix. Continuing the row reduction begun above, 
but without the rightmost column: 


TA, +A2,Ad; -6 3 -I15 
pen tite o : 
0 O 0 


The only pivot position of the coefficient matrix appears in the first column, so the first column of the coefficient 


7 —-6 
matrix provides a basis for the column space. We therefore project | 1 | onto | 14 
-15 


14 14 
62 + 142 + 18? 278 


1(—6) + 114) = al - - =| -° | 
=8 A 


and a best approximation of the solution is x = oe y=0,z=0. 


Note: Answers may vary. Any linear combination of the columns of the coefficient matrix that sums to 


—6 
oT | 14 | will do. There are infinitely many of them. 


-8 


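8b: By theorem 16, every vector of $\mathbb{R}^3$ can be written as a sum of a vector in W and a vector in $W^\perp$. Hence if $\mathcal{B}$ is a
basis for W and $\mathcal{B}^\perp$ is a basis for $W^\perp$, it must be that $\mathcal{B} \cup \mathcal{B}^\perp$ spans $\mathbb{R}^3$. That $\mathcal{B} \cup \mathcal{B}^\perp$ is linearly independent
(and therefore a basis) is left as an exercise (see exercise 12). In the case of this question, $\mathcal{B}$ is given with
one element meaning $\mathcal{B}^\perp$ must have two elements. We seek two linearly independent vectors orthogonal to
$W = \operatorname{span}\left\{\begin{bmatrix}7\\11\\12\end{bmatrix}\right\}$. A quick computation will show $\begin{bmatrix}-11\\7\\0\end{bmatrix}$ and $\begin{bmatrix}0\\-12\\11\end{bmatrix}$ are orthogonal to W (do you see
where they came from?). Since $\begin{bmatrix}-11\\7\\0\end{bmatrix}$ and $\begin{bmatrix}0\\-12\\11\end{bmatrix}$ are additionally linearly independent, they form a basis
for $W^\perp$.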
13c: This question is requesting the best approximation of (the point) $p = x^3 - 11x^2 - 9x + 10$ within the subspace
$W = \operatorname{span}\{12x^3 - x - 5\}$ (the collection of all multiples of q). By theorem 18, the answer is the projection of p
onto W:

$$\operatorname{proj}_W p = \operatorname{proj}_q p = \frac{\langle p, q\rangle}{\langle q, q\rangle}q = \frac{-4132}{2166}q = -\frac{2066}{1083}q.$$

14c: According to the solution above, the best approximation is $-\frac{2066}{1083}q$. Distance is measured as the norm of the
difference, which is $\left\|-\frac{2066}{1083}q - p\right\|$:

$$\left\|-\frac{2066}{1083}q - p\right\| = \frac{1}{1083}\|2066q + 1083p\| = \frac{1}{1083}\left\|25875x^3 - 11913x^2 - 11813x + 500\right\| \approx 4.664$$


Section 7.1 


3a: The coefficient matrix and constant vector for the linear regression problem are

$$M = \begin{bmatrix}1&x_1&x_1^2&x_1^3\\1&x_2&x_2^2&x_2^3\\\vdots&\vdots&\vdots&\vdots\\1&x_8&x_8^2&x_8^3\end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix}141.1\\-35.51\\\vdots\\1783\end{bmatrix}.$$

According to the code at SageMath Cell 128, M is (with whitespace changes only)


[ 1 389/1000 151321/1000000 58863869/1000000000]
[ 1 851/1000 724201/1000000 616295051/1000000000]
[ 1 2467/1000 6086089/1000000 15014381563/1000000000]
[ 1 4113/1000 16916769/1000000 69578670897/1000000000]
[ 1 181/40 32761/1600 5929741/64000]
[ 1 6639/1000 44076321/1000000 292622695119/1000000000]
[ 1 8873/1000 78730129/1000000 698572434617/1000000000]
[ 1 281/25 78961/625 22188041/15625]


The code proceeds to calculate the normal equations $M^TM\beta = M^Tb$ and solve for $\beta$, the regression coefficients:

$$\begin{bmatrix}\beta_0&\beta_1&\beta_2&\beta_3\end{bmatrix}^T = \begin{bmatrix}121.697796&-65.6594829&10.8873569&0.744289915\end{bmatrix}^T.$$

Hence the best fit model is

$$f(x) = 0.744289915x^3 + 10.8873569x^2 - 65.6594829x + 121.697796.$$


A plot of f(x) superimposed on a scatterplot of the data is shown below, demonstrating geometrically the 
goodness of fit. 
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Note: Unlike with the method of projection, it is not critical that the data are entered exactly nor that the 
computation is done using exact arithmetic. Replacing the first two lines of the code by 


x = vector([.389, .851,2.467,4.113,4.525,6.639,8.873,11.24]) 
fx = vector([141.1,-35.51,167.1,18.3,173,243,1039,1783]) 


(leaving the rest untouched) and running the code results in nearly the same solution. It will differ slightly
due to roundoff, but it is successful! Besides the simplicity of this method as compared to projection, being
able to do the calculation using decimal approximations (floating point) is necessary for efficient computer
implementation. Most programming languages do not provide exact computation, and those that do (like
SageMath) are much slower. It is impractical to require exact arithmetic. Try it for yourself at SageMath Cell 129.
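As a rough illustration of the note above, the following sketch forms the normal equations twice, once with exact fractions and once with floating point numbers, and compares the results. It is written in plain Python rather than being the SageMath code the text refers to, and the helper names `solve` and `normal_equations_fit` are invented for this example.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A v = b by Gauss-Jordan elimination with row swaps."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        # Bring the largest available pivot into row c.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [e / pivot for e in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [er - factor * ec for er, ec in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def normal_equations_fit(xs, ys, degree=3):
    """Return [beta_0, ..., beta_degree] solving (M^T M) v = M^T b."""
    M = [[x ** j for j in range(degree + 1)] for x in xs]
    idx = range(degree + 1)
    MtM = [[sum(row[i] * row[j] for row in M) for j in idx] for i in idx]
    Mtb = [sum(row[i] * y for row, y in zip(M, ys)) for i in idx]
    return solve(MtM, Mtb)

# The data from the solution, kept as text so both versions start from the same input.
x_text = ["0.389", "0.851", "2.467", "4.113", "4.525", "6.639", "8.873", "11.24"]
fx_text = ["141.1", "-35.51", "167.1", "18.3", "173", "243", "1039", "1783"]

exact = normal_equations_fit([Fraction(s) for s in x_text], [Fraction(s) for s in fx_text])
approx = normal_equations_fit([float(s) for s in x_text], [float(s) for s in fx_text])

for e, a in zip(exact, approx):
    print(float(e), a)  # exact coefficient next to its floating point counterpart
```

The two columns printed should agree to many digits, which is the sense in which the floating point calculation "is successful."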


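3e: The coefficient matrix and constant vector for the linear regression problem are

$$M = \begin{bmatrix} x_1^2 & x_1t_1 & t_1^2 \\ x_2^2 & x_2t_1 & t_1^2 \\ \vdots & \vdots & \vdots \\ x_4^2 & x_4t_4 & t_4^2 \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} -1.36 \\ 8.17 \\ \vdots \\ 50 \end{bmatrix}.$$

According to the code at the associated SageMath Cell, M is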


[1681/1000000 12833/500000 97969/250000] 
[23409/250000 47889/250000 97969/250000] 
[ 11449/10000 33491/50000 97969/250000] 
[ 2304/625 3756/3125 97969/250000] 
[1681/1000000 4141/100000 10201/10000]
[23409/250000 15453/50000 10201/10000] 
[ 11449/10000 10807/10000 10201/10000] 
[ 2304/625 1212/625 10201/10000] 
[1681/1000000 6273/100000 23409/10000] 
[23409/250000 23409/50000 23409/10000] 
[ 11449/10000 16371/10000 23409/10000]
[ 2304/625 1836/625 23409/10000] 
[1681/1000000 6683/100000 26569/10000]
[23409/250000 24939/50000 26569/10000] 
[ 11449/10000 17441/10000 26569/10000] 
[ 2304/625 1956/625 26569/10000] 


The code proceeds to calculate the normal equations $M^T M\mathbf{v} = M^T\mathbf{b}$ and solve for $\mathbf{v}$, the regression coefficients:

$$\begin{bmatrix} \beta_0 & \beta_1 & \beta_2 \end{bmatrix}^T = \begin{bmatrix} 4.88443881609 & 3.4469959515 & 8.28520093268 \end{bmatrix}^T.$$

Hence the best fit model is

$$f(x, t) = 4.88443881609x^2 + 3.4469959515xt + 8.28520093268t^2.$$


Note: Unlike with the method of projection, it is not critical that the data are entered exactly nor that the 
computation is done using exact arithmetic. Replacing the first three lines of the code by 


x = vector([.041, .306,1.07,1.92]) 

t = vector([.626,1.01,1.53,1.63]) 

ktx = vector([-1.36,8.17,6.59,24.1,8.47,8.88,16.9, 38,18.9, 
19,29,46.1,21.4,28.9,36,50]) 


(leaving the rest untouched) and running the code results in nearly the same solution. It will differ slightly
due to roundoff, but it is successful! Besides the simplicity of this method as compared to projection, being
able to do the calculation using decimal approximations (floating point) is necessary for efficient computer
implementation. Most programming languages do not provide exact computation, and those that do (like
SageMath) are much slower. It is impractical to require exact arithmetic. Try it for yourself at SageMath Cell 131.


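4a: The coefficient matrix and constant vector for the linear regression problem are

$$M = \begin{bmatrix} 1 & x_1 & x_1^2 & x_1^3 \\ 1 & x_2 & x_2^2 & x_2^3 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_8 & x_8^2 & x_8^3 \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} 141.1 \\ -35.51 \\ \vdots \\ 1783 \end{bmatrix}.$$

According to the code at SageMath Cell 132, M is (with whitespace changes only)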


[ 1 389/1000 151321/1000000 58863869/1000000000] 
[ 1 851/1000 724201/1000000 616295051/1000000000] 
[ 1 2467/1000 6086089/1000000 15014381563/1000000000] 
[ 1 4113/1000 16916769/1000000 69578670897/1000000000] 
[ 1 181/40 32761/1600 5929741/64000] 
[ 1 6639/1000 44076321/1000000 292622695119/1000000000] 
[ 1 8873/1000 78730129/1000000 698572434617/1000000000] 
[ 1 281/25 78961/625 22188041/15625] 


The code proceeds to orthogonalize the columns of M and then project b onto its column space. Using exact
arithmetic as SageMath does, the projection is too long to print. It is approximately

$$\begin{bmatrix} 97.8 & 74.2 & 37.2 & 87.6 & 116 & 383 & 916 & 1816 \end{bmatrix}^T$$

This is the best approximation of b within the column space of M. Therefore, the regression coefficients (in the
order $\beta_0, \beta_1, \beta_2, \beta_3$) are given by the solution of the system

$$M\mathbf{v} = \begin{bmatrix} 97.8 & 74.2 & 37.2 & 87.6 & 116 & 383 & 916 & 1816 \end{bmatrix}^T$$

which according to the code (again approximately since the exact results are too long to print) is

$$\begin{bmatrix} \beta_0 & \beta_1 & \beta_2 & \beta_3 \end{bmatrix}^T = \begin{bmatrix} 121.697796 & -65.6594829 & 10.8873569 & 0.744289915 \end{bmatrix}^T.$$

Hence the best fit model is

$$f(x) = 0.744289915x^3 + 10.8873569x^2 - 65.6594829x + 121.697796.$$


A plot of f(x) superimposed on a scatterplot of the data is shown below, demonstrating geometrically the 
goodness of fit. 


Note: It is critical that the data are entered exactly and that the computation is done using exact arithmetic. 
Replacing the first two lines of the code by 


x = vector([.389, .851,2.467,4.113,4.525,6.639,8.873,11.24]) 
fx = vector([141.1,-35.51,167.1,18.3,173,243,1039,1783]) 


(leaving the rest untouched) and running the code results in an inconsistent system, due to roundoff error. The
parameters cannot be calculated since the approximated projection of b is not in the column space of
the approximated M. The rest of the calculation works perfectly well. Try it for yourself at the associated SageMath Cell.
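The projection route can be sketched in exact arithmetic as follows. This is plain Python with `fractions.Fraction` standing in for SageMath's exact rationals, not the code from the SageMath Cell, and the function names are invented for the illustration. The final check is the delicate step: the system M v = proj has eight equations in four unknowns, so it is consistent only if proj lies exactly in the column space of M.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_onto_column_space(columns, b):
    """Orthogonalize the columns (Gram-Schmidt), then project b onto their span."""
    basis = []
    for col in columns:
        w = list(col)
        for q in basis:
            k = dot(w, q) / dot(q, q)
            w = [wi - k * qi for wi, qi in zip(w, q)]
        basis.append(w)
    proj = [Fraction(0)] * len(b)
    for q in basis:
        k = dot(b, q) / dot(q, q)
        proj = [pi + k * qi for pi, qi in zip(proj, q)]
    return proj

def solve(A, b):
    """Solve the square system A v = b by Gauss-Jordan elimination with row swaps."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [e / pivot for e in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [er - factor * ec for er, ec in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

xs = [Fraction(s) for s in ["0.389", "0.851", "2.467", "4.113", "4.525", "6.639", "8.873", "11.24"]]
b = [Fraction(s) for s in ["141.1", "-35.51", "167.1", "18.3", "173", "243", "1039", "1783"]]

M = [[x ** j for j in range(4)] for x in xs]
columns = [[row[j] for row in M] for j in range(4)]
proj = project_onto_column_space(columns, b)

# Solve using the first four equations, then check that the other four agree exactly.
v = solve(M[:4], proj[:4])
consistent = all(dot(row, v) == p for row, p in zip(M, proj))

print([round(float(p), 1) for p in proj])
print(consistent)
```

With floating point inputs the same equality test would generally fail, because roundoff moves the computed projection slightly off the column space. That is the inconsistency the note describes.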


4e: The coefficient matrix and constant vector for the linear regression problem are

$$M = \begin{bmatrix} x_1^2 & x_1t_1 & t_1^2 \\ x_2^2 & x_2t_1 & t_1^2 \\ \vdots & \vdots & \vdots \\ x_4^2 & x_4t_4 & t_4^2 \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} -1.36 \\ 8.17 \\ \vdots \\ 50 \end{bmatrix}.$$

According to the code at the associated SageMath Cell, M is

[1681/1000000 12833/500000 97969/250000]
[23409/250000 47889/250000 97969/250000] 
[ 11449/10000 33491/50000 97969/250000] 
[ 2304/625 3756/3125 97969/250000] 
[1681/1000000 4141/100000 10201/10000] 
[23409/250000 15453/50000 10201/10000] 
[ 11449/10000 10807/10000 10201/10000] 
[ 2304/625 1212/625 10201/10000] 
[1681/1000000 6273/100000 23409/10000] 
[23409/250000 23409/50000 23409/10000] 
[ 11449/10000 16371/10000 23409/10000] 
[ 2304/625 1836/625 23409/10000] 
[1681/1000000 6683/100000 26569/10000]
[23409/250000 24939/50000 26569/10000] 
[ 11449/10000 17441/10000 26569/10000]
[ 2304/625 1956/625 26569/10000] 


The code proceeds to orthogonalize the columns of M and then project b onto its column space. Using exact
arithmetic as SageMath does, the projection is too long to print. It is approximately

$$\begin{bmatrix} 3.3 & 4.4 & 11 & 25 & 8.6 & 10 & 18 & 33 & 20 & 21 & 31 & 48 & 22 & 24 & 34 & 51 \end{bmatrix}^T$$

This is the best approximation of b within the column space of M. Therefore, the regression coefficients (in the
order $\beta_0, \beta_1, \beta_2$) are given by the solution of the system

$$M\mathbf{v} = \begin{bmatrix} 3.3 & 4.4 & 11 & 25 & 8.6 & 10 & 18 & 33 & 20 & 21 & 31 & 48 & 22 & 24 & 34 & 51 \end{bmatrix}^T$$

which according to the code (again approximately since the exact results are too long to print) is

$$\begin{bmatrix} \beta_0 & \beta_1 & \beta_2 \end{bmatrix}^T = \begin{bmatrix} 4.88443881609 & 3.4469959515 & 8.28520093268 \end{bmatrix}^T.$$

Hence the best fit model is

$$f(x, t) = 4.88443881609x^2 + 3.4469959515xt + 8.28520093268t^2.$$


Note: It is critical that the data are entered exactly and that the computation is done using exact arithmetic. 
Replacing the first three lines of the code by 


x = vector([.041, .306,1.07,1.92]) 

t = vector([.626,1.01,1.53,1.63]) 

ktx = vector([-1.36,8.17,6.59,24.1,8.47,8.88, 16.9, 38,18.9, 
19,29,46.1,21.4,28.9,36,50]) 


(leaving the rest untouched) and running the code results in an inconsistent system, due to roundoff error. The
parameters cannot be calculated since the approximated projection of b is not in the column space of
the approximated M. Try it for yourself at the associated SageMath Cell.


5a: The sum of the squared errors equals $\|M\mathbf{v} - \mathbf{b}\|^2 = (M\mathbf{v} - \mathbf{b}) \cdot (M\mathbf{v} - \mathbf{b})$. Adding the lines


print(); print("Sum of squared errors:")
print((proj-fx)*(proj-fx))


to the code from the solution of question 3a (as seen at SageMath Cell 136) gives a value of


$$\|M\mathbf{v} - \mathbf{b}\|^2 \approx 74685.25$$
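The reported value can be checked independently of SageMath. The sketch below is plain Python and assumes the rounded coefficients printed in the solution of question 3a. It evaluates the fitted cubic at each data point and sums the squared residuals, so `fitted` plays the role of `proj` in the SageMath lines above.

```python
xs = [0.389, 0.851, 2.467, 4.113, 4.525, 6.639, 8.873, 11.24]
fx = [141.1, -35.51, 167.1, 18.3, 173, 243, 1039, 1783]

# Coefficients beta_0..beta_3 as printed in the solution of question 3a (rounded).
beta = [121.697796, -65.6594829, 10.8873569, 0.744289915]

def model(x):
    """Evaluate beta_0 + beta_1 x + beta_2 x^2 + beta_3 x^3 by Horner's rule."""
    b0, b1, b2, b3 = beta
    return b0 + x * (b1 + x * (b2 + x * b3))

fitted = [model(x) for x in xs]                 # the entries of M v
residual = [p - y for p, y in zip(fitted, fx)]  # M v - b
sse = sum(r * r for r in residual)              # (M v - b) . (M v - b)

print("Sum of squared errors:")
print(sse)
```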


5e: The sum of the squared errors equals $\|M\mathbf{v} - \mathbf{b}\|^2 = (M\mathbf{v} - \mathbf{b}) \cdot (M\mathbf{v} - \mathbf{b})$. Adding the lines


print(); print("Sum of squared errors:")
print((proj-ktx)*(proj-ktx))


to the code from the solution of question 3e (as seen at SageMath Cell 137) gives a value of


$$\|M\mathbf{v} - \mathbf{b}\|^2 \approx 125.160568$$


Section 7.2 


1b: A transition matrix must (i) be square, (ii) have nonnegative entries, and (iii) have columns that sum to 1. The
matrix

$$M = \begin{bmatrix} 0.09 & 0.686 & 0.168 \\ 0.908 & 0.036 & 0.807 \\ 0.002 & 0.278 & 0.485 \end{bmatrix}$$

has qualities (i) and (ii), but not (iii) since the third column does not sum to 1. Therefore, M is not a transition
matrix.
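The three conditions are easy to test mechanically. The following is a minimal sketch in plain Python. The function name `is_transition_matrix` and the tolerance are choices made for this example, not anything from the text.

```python
def column_sums(M):
    return [sum(row[j] for row in M) for j in range(len(M[0]))]

def is_transition_matrix(M, tol=1e-9):
    """Check (i) square, (ii) nonnegative entries, (iii) every column sums to 1."""
    n = len(M)
    if any(len(row) != n for row in M):
        return False
    if any(entry < 0 for row in M for entry in row):
        return False
    return all(abs(s - 1) <= tol for s in column_sums(M))

M = [[0.09, 0.686, 0.168],
     [0.908, 0.036, 0.807],
     [0.002, 0.278, 0.485]]

print(column_sums(M))           # the third sum is the one that is not 1
print(is_transition_matrix(M))
```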


Section 7.3 
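4b: In $C([0, L])$,

$$\left\langle \cos\left(\tfrac{m\pi t}{L}\right), \cos\left(\tfrac{m\pi t}{L}\right) \right\rangle = \int_0^L \cos^2\left(\tfrac{m\pi t}{L}\right) dt = \frac{1}{2}\int_0^L \left(1 + \cos\left(\tfrac{2m\pi t}{L}\right)\right) dt = \frac{1}{2}\left[t + \frac{L}{2m\pi}\sin\left(\tfrac{2m\pi t}{L}\right)\right]_0^L = \frac{L}{2}.$$

Note that $\sin(2m\pi) = 0$ for any integer m.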


5b: From question 4c, in $C([0, 1])$, $\langle \sin(m\pi t), \sin(m\pi t) \rangle = \frac{1}{2}$. Therefore

$$b_m = 2\langle f, \sin(m\pi t) \rangle = 2\int_0^1 t\sin(m\pi t)\,dt = \frac{2}{m\pi}\left[\frac{1}{m\pi}\sin(m\pi t) - t\cos(m\pi t)\right]_0^1 = -\frac{2}{m\pi}\cos(m\pi) = -\frac{2}{m\pi}(-1)^m = \frac{2}{m\pi}(-1)^{m+1}$$

and the Fourier sine series is

$$\frac{2}{\pi}\sin(\pi t) - \frac{1}{\pi}\sin(2\pi t) + \frac{2}{3\pi}\sin(3\pi t) - \cdots$$

Notes:

1. $\sin(m\pi) = 0$ for any integer m

2. $\int t\sin(m\pi t)\,dt$ can be calculated using integration by parts:

$$\int t\sin(m\pi t)\,dt = -\frac{t}{m\pi}\cos(m\pi t) + \frac{1}{m\pi}\int \cos(m\pi t)\,dt = -\frac{t}{m\pi}\cos(m\pi t) + \frac{1}{(m\pi)^2}\sin(m\pi t) = \frac{1}{m\pi}\left[\frac{1}{m\pi}\sin(m\pi t) - t\cos(m\pi t)\right]$$
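A numerical check of the closed form can be reassuring. The sketch below is plain Python and assumes f(t) = t, as in the integral above. It approximates the integral with Simpson's rule (a quadrature choice made for this example, not something the text uses) and compares it with the formula for several values of m.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def b_numeric(m):
    """b_m = 2 * integral over [0, 1] of t sin(m pi t) dt, by quadrature."""
    return 2 * simpson(lambda t: t * math.sin(m * math.pi * t), 0.0, 1.0)

def b_formula(m):
    """Closed form from the solution: b_m = 2 (-1)^(m+1) / (m pi)."""
    return 2 * (-1) ** (m + 1) / (m * math.pi)

for m in range(1, 6):
    print(m, b_numeric(m), b_formula(m))
```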


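6b: From question 4c, in $C([0, 1])$, $\langle \cos(m\pi t), \cos(m\pi t) \rangle = \frac{1}{2}$ and from question 4a, $\langle 1, 1 \rangle = 1$. Therefore,

$$a_0 = \langle f, 1 \rangle = \int_0^1 t\,dt = \left[\frac{t^2}{2}\right]_0^1 = \frac{1}{2}$$

and for m = 1, 2, ...

$$a_m = 2\langle f, \cos(m\pi t) \rangle = 2\int_0^1 t\cos(m\pi t)\,dt = \frac{2}{m\pi}\left[t\sin(m\pi t) + \frac{1}{m\pi}\cos(m\pi t)\right]_0^1 = \frac{2}{(m\pi)^2}\left[\cos(m\pi) - 1\right] = \frac{2}{(m\pi)^2}\left[(-1)^m - 1\right] = \frac{-4}{(m\pi)^2} \text{ for } m = 1, 3, 5, \ldots$$

and the Fourier cosine series is

$$\frac{1}{2} - \frac{4}{\pi^2}\cos(\pi t) - \frac{4}{9\pi^2}\cos(3\pi t) - \frac{4}{25\pi^2}\cos(5\pi t) - \cdots$$

Notes:

1. $\sin(m\pi) = 0$ for any integer m

2. $\int t\cos(m\pi t)\,dt$ can be calculated using integration by parts:

$$\int t\cos(m\pi t)\,dt = \frac{t}{m\pi}\sin(m\pi t) - \frac{1}{m\pi}\int \sin(m\pi t)\,dt = \frac{t}{m\pi}\sin(m\pi t) + \frac{1}{(m\pi)^2}\cos(m\pi t) = \frac{1}{m\pi}\left[t\sin(m\pi t) + \frac{1}{m\pi}\cos(m\pi t)\right]$$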


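7a: From question 4, in $C([0, 1])$, $\langle \cos(m\pi t), \cos(m\pi t) \rangle = \langle \sin(m\pi t), \sin(m\pi t) \rangle = \frac{1}{2}$ and $\langle 1, 1 \rangle = 1$. Combined
with the fact that the functions 1, $\cos^2(m\pi t)$, and $\sin^2(m\pi t)$ are even, in $C([-1, 1])$, $\langle \cos(m\pi t), \cos(m\pi t) \rangle =
\langle \sin(m\pi t), \sin(m\pi t) \rangle = 1$ and $\langle 1, 1 \rangle = 2$. Therefore,

$$a_0 = \frac{1}{2}\langle f, 1 \rangle = \frac{1}{2}\int_{-1}^{1} 1\,dt = \frac{1}{2}\,t\,\Big|_{-1}^{1} = 1$$

and for m = 1, 2, ...

$$a_m = \langle f, \cos(m\pi t) \rangle = \int_{-1}^{1} \cos(m\pi t)\,dt = \frac{1}{m\pi}\sin(m\pi t)\Big|_{-1}^{1} = 0$$

$$b_m = \langle f, \sin(m\pi t) \rangle = \int_{-1}^{1} \sin(m\pi t)\,dt = 0 \quad \text{(because } \sin(m\pi t) \text{ is odd)}$$

and the Fourier cosine series is

$$1$$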
Section 7.4 
1a: Though SageMath is not strictly required, it makes repetitive calculations like this more manageable.
af) = 1] 16 26 0 A 1/7 |_| 1/7 | | 0.14285 
mh SA |G “2 23-|| G 3/14 |~ | 3/14 |*| 0.21428 }’ 


ee ee 1/7 || 1/7 |_| 55/294 |_| 0.18707 
Be TU "03 | 65 <9 ||| Sia 3/14 |=] 61/588 |~| 0.10374 | 
7 Head 16 26 |[ 55/294] | 1/7 |_| 239/1764 | _ | 0.13548 
eee! = FS | 65 23 || 61/588 3/14 |~ | 821/3528 |~| 0.23270 |’ 
; fet 16 26 |[ 239/1764 1/7 |_[ 2071/10584 |_| 0.19567 
eee We gs 23: ||| BI /3528 3/14 |~ | 1741/21168 | ~ | 0.08224 
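2a: A fixed point of a function f(x) is a solution of the equation f(x) = x (where the input and output of the function
are equal). Iteration beginning at the fixed point stays at the fixed point. Solving for the fixed points:

$$\frac{1}{42}\begin{bmatrix} 16 & 26 \\ 65 & -23 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 1/7 \\ 3/14 \end{bmatrix} = \mathbf{x}$$

$$\frac{1}{42}\begin{bmatrix} 16 & 26 \\ 65 & -23 \end{bmatrix}\mathbf{x} - \mathbf{x} = -\begin{bmatrix} 1/7 \\ 3/14 \end{bmatrix}$$

$$\left(\frac{1}{42}\begin{bmatrix} 16 & 26 \\ 65 & -23 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)\mathbf{x} = -\begin{bmatrix} 1/7 \\ 3/14 \end{bmatrix}$$

$$\frac{1}{42}\begin{bmatrix} -26 & 26 \\ 65 & -65 \end{bmatrix}\mathbf{x} = -\begin{bmatrix} 1/7 \\ 3/14 \end{bmatrix}$$

so fixed points are solutions of

$$\begin{bmatrix} -26 & 26 \\ 65 & -65 \end{bmatrix}\mathbf{x} = \begin{bmatrix} -6 \\ -9 \end{bmatrix},$$

an inconsistent system. f has no fixed points.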
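The inconsistency can be confirmed with exact rational arithmetic. This is a small sketch in plain Python, with variable names chosen for the example. Each row of the reduced coefficient matrix has the form [k, -k], so each row pins down the difference x1 - x2, and the two rows demand different values.

```python
from fractions import Fraction

A = [[Fraction(16, 42), Fraction(26, 42)],
     [Fraction(65, 42), Fraction(-23, 42)]]
c = [Fraction(1, 7), Fraction(3, 14)]

# Fixed points satisfy A x + c = x, that is (A - I) x = -c.
B = [[A[0][0] - 1, A[0][1]],
     [A[1][0], A[1][1] - 1]]
rhs = [-c[0], -c[1]]

det = B[0][0] * B[1][1] - B[0][1] * B[1][0]

# Row i reads B[i][0] * (x1 - x2) = rhs[i], because B[i][1] == -B[i][0].
difference_from_row_1 = rhs[0] / B[0][0]
difference_from_row_2 = rhs[1] / B[1][0]

has_fixed_point = det != 0 or difference_from_row_1 == difference_from_row_2
print(det, difference_from_row_1, difference_from_row_2, has_fixed_point)
```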
4a: The dynamical system has no fixed point, so it cannot have an attractor (an attractor is a fixed point). 


5a: The solution of question 1a indicates the orbit is not constant. The answer to question 3a gives eigenvalues
of $-\frac{7}{6}$ and 1, so the spectral radius of $\frac{1}{42}\begin{bmatrix} 16 & 26 \\ 65 & -23 \end{bmatrix}$ is greater than one ($\frac{7}{6}$ in this case). With a spectral
radius greater than one and an orbit that is not fixed, the dynamical system will tend toward infinity
(at approximately the rate $\left(\frac{7}{6}\right)^k$).
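For a 2 x 2 matrix the eigenvalues follow directly from the trace and determinant, so the quoted values are quick to confirm. The helper below is a hypothetical illustration in plain Python, not code from the book.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

lam1, lam2 = eigenvalues_2x2(16 / 42, 26 / 42, 65 / 42, -23 / 42)
spectral_radius = max(abs(lam1), abs(lam2))

print(lam1, lam2, spectral_radius)
```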


5j: With a spectral radius less than one (5 in this case), the fixed point will be an attractor, and the dynamical system 
will tend toward the fixed point. 


Section 7.5 


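1b: Answers may vary. The figure is scaled by $\frac{1}{2}$ in both the horizontal and vertical directions, either reflected about
the y-axis or rotated 90°, and then translated into place. To be more specific, these general observations lead to
two possibilities (and there are others):


1. scale by $\frac{1}{2}$, reflect about the y-axis, translate by $\begin{bmatrix} 4 \\ 0 \end{bmatrix}$. As a formula,

$$T(\mathbf{x}) = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1/2 & 0 \\ 0 & 1/2 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} -1/2 & 0 \\ 0 & 1/2 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 0 \end{bmatrix}$$

2. scale by $\frac{1}{2}$, rotate 90° (counterclockwise) about the origin, translate by $\begin{bmatrix} 4 \\ 0 \end{bmatrix}$. As a formula,

$$T(\mathbf{x}) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1/2 & 0 \\ 0 & 1/2 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & -1/2 \\ 1/2 & 0 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 4 \\ 0 \end{bmatrix}$$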
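The linear parts of the two possibilities are products of a reflection or rotation matrix with the scaling matrix. The sketch below is plain Python with exact fractions, and the names `matmul`, `option_1`, and `option_2` are made up for the example. It multiplies the matrices in the order the operations are applied (scale first, so the scaling matrix sits on the right).

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

half = Fraction(1, 2)
scale = [[half, 0], [0, half]]       # scale by 1/2 in both directions
reflect_y = [[-1, 0], [0, 1]]        # reflect about the y-axis
rotate_90 = [[0, -1], [1, 0]]        # rotate 90 degrees counterclockwise

option_1 = matmul(reflect_y, scale)  # scale, then reflect
option_2 = matmul(rotate_90, scale)  # scale, then rotate

print(option_1)
print(option_2)
```

Because a uniform scaling commutes with every 2 x 2 matrix, the order of the scaling and the reflection (or rotation) does not matter here. The order would matter if the reflection and rotation were combined with each other.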
3: One of the easiest ways to show that a shape tessellates the plane is to show that it tiles a parallelogram or a
triangle. That tiling can then be repeated to tessellate the plane. In this case, however, that is not an option. 
A dissect and inflate, dissect and inflate, dissect and inflate,... procedure can be used instead. The inflation 
is performed so that one part of the dissection becomes the whole (and the other parts lie outside the whole). 
Which part becomes the whole rotates among the parts. In pictures: 


[Figure: three successive dissect-and-inflate steps applied to the shape.]

9d: The IFS contains an affine transformation for each part of the dissection. Each transformation maps the whole to 
one of the parts. 
1. scale $\frac{1}{2}$

2. scale $\frac{1}{2}$, reflect about x-axis, translate $(0, 2\sqrt{3})$

3. scale $\frac{1}{2}$, rotate 120°, translate $(3, \sqrt{3})$

4. scale $\frac{1}{2}$, reflect about x-axis, rotate −60°, translate $(3, \sqrt{3})$


Formally, the IFS is 


1 1 0 
{rw =| 6 1 |x 79 =| 6 “ xt+ 23 | 
ee 1 2] f 3 
T3(X) = x+ | T4(x) = | x+ | 
ei)" bw 2 ils 


Note: There are different possible literal descriptions of the transformations, but the IFS itself is unique. 


10: Taking the literal descriptions from the solution of exercise 9d and translating them to the required sequence: 
1. scale $\frac{1}{2}$: none, $\frac{1}{2}$, 0, 0, 0

2. scale $\frac{1}{2}$, reflect about x-axis, translate $(0, 2\sqrt{3})$: x-axis, $\frac{1}{2}$, 0, 0, $2\sqrt{3} \approx 3.46410$

3. scale $\frac{1}{2}$, rotate 120°, translate $(3, \sqrt{3})$: none, $\frac{1}{2}$, 120, 3, $\sqrt{3} \approx 1.73205$

4. scale $\frac{1}{2}$, reflect about x-axis, rotate −60°, translate $(3, \sqrt{3})$: x-axis, $\frac{1}{2}$, −60, 3, $\sqrt{3} \approx 1.73205$


A screenshot of this information in the rep-tile designer shows the correct rep-tile, verifying that the transfor- 
mations are correct. 


[Screenshot: the Rep-Tile Designer with these four transformations entered (reflection, scale ratio, rotation, horizontal translation, vertical translation) at dissection level 1.]


11: Each part is labeled with its scale factor in the following diagram. Each of the labeled sides generates one 


equation involving the scale factors by the fact that the lengths of the two parts of each side must sum to the 
length of the whole side. 
2.683 Vv 138


J 13 55, + V1359 =5 


2.652 + 582 = V13 


2.653 + V135> = 2.6 
_——. S89 


5S] V 1389 


In matrix form, the system of equations is 


5 vB 5 5 5 
0 8  vI3 |) s2 |=] v3 
0 vi3 8 53 8 


and can be solved using the code at @ Sead? 138. The solution is 5; = 2, SQ = =e 53= R. Note: the 


code also verifies that the sum of the squares of the scale factors (s? + 2s; + s3) is 1, evidence that the solution 
is correct. 
Answers to Selected Exercises


Section 1.1 2m: undefined 
5b: 1x6 3c: 27 
St: 2x2 3g: 77 

1 

1 3j: -43 

0 
6d: _8 4e: 27 

-5 

-3 4g: 77 

-7 2 1 8 4j: -—43 

-8 -ll 10 -6 
6h: 9 -1l 3 -6 S: 2x5 

6 -9 3 4 

2 -4 10 -7 8: (a) true (b) true 
9: (a) C = matrix(3,2,[-4,-9,13,-11,7,5, 


-14,-2,-12,12,8,11]) (b) print(c) (c) . 
entry = C[3,2] (d) print(entry) Section 1.4 


le: 109 


Section 1.2 th: 30.42 = 5.51543289325507 


—2 -12 14 -3 


le: -8 -§ 3 -5 1j: 3 V15 
4: Yes. Explain. 2e: 2 ¥34 


: 2h: V6.81 ~ 2.60959767013998 
Section 1.3 


2j: V91 
If: —23 
Ig: 15.26 oe. yes 
i 11 6 3h: no 
Fl 11 6 
3j: yes 
18 9 3 
- | 6 11 -23 | af: k=—14, 6 
Section 1.5


1d: Cr2 = det(8) 


If: Co. =- de( " - 

2d: 44 

2h: 97 

21: 273 

4: (a) true (b) false (c) false (d) true (e) false 


10: It is not possible to write any row as a linear com- 
bination of the other two in any of the remaining 
seven blocks. 


Section 1.6 


1d: [ x | 
_[ 1. -v2 
1g: _ 22 5 
3 3 
1 2 O 
In: |} —2 6 -3 
1 -1 1 


1p: Not possible. 


13-8 5 1 
‘(6-8 42 
ol > ee ST 

a. <4 2° 4 


Section 1.7 


1b: -3 

If: 8 

2e: A27+21a 

2e: 22-227 -A+2 


3b: —2,3 
3f: -1,0 

3j: -24iv2 
3k: -1,0,1 


4b: any vector of the form | : ; 


4 
4e: any vector of the form | -3-iN3 | 


1 
4h: any vector of the form | 2 
1 


6: (a) false (b) false (c) true (d) false (e) false (f) true (g) 


false 
-1 
9: The eigenvector is} _ 6 and the associated eigen- 
—2 
value is 9. 
Section 2.1 
3 2 -8 9 
1b: 0 -3 2 10 
-7 1 0 -ilil 
3 2 -3 -7 
Id: | -5 1 -2 -8 
1 1 1 StI 
-14y, - 15% + = 87 -8 
2b: -13y, + 2% - vy = 13 
15v1 = Ov. = 63 = 12 
2d: 
10v, — 9v2 — v3 + 3v4 + 15v5 = 6 
-lly, + 12v2 + 13v3 + Sv4 = Avs =-2 
5: One solution is $v_1 = -5$, $v_2 = 4$, $v_3 = 0$. It has infinitely many more because both $v_1$ and $v_2$ can be
written in terms of $v_3$, and $v_3$ can take any value:

$$v_1 = -5 - 2v_3 \quad\text{and}\quad v_2 = 4 - 3v_3$$

The example solution comes from taking $v_3 = 0$.
6b: Swap rows 2 and 3 


6d: Row 2 replaced by row 2 minus 9 row 3 


3 -t 1 
a E -10 30 2A 
9 9 -4 
8d:| 5 6 -2 
9 9 ~6 


—5 9 -6 
8f: |} 22 -10 -44 29 

-8 3 2 2 
9d: Swap rows 1 and 3


9e: Scale row 3 by 3 
10b: Replace row 1 by row 1 minus 2 row 2:

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & -2 \end{bmatrix}$$

11b: Replace row 2 by row 2 minus 2 row 1:

$$\begin{bmatrix} 1 & 0 & -3 \\ 0 & 3 & 4 \end{bmatrix}$$

14: Each coefficient of the linear combination that gives
the second row of AB is the same multiple of the
corresponding coefficient of the linear combination
that gives the first row of AB, so the second row
must be that multiple of the first row:

$$(AB)_{1:} = A_{1,1}B_{1:} + A_{1,2}B_{2:} + \cdots + A_{1,n}B_{n:}$$

$$(AB)_{2:} = kA_{1,1}B_{1:} + kA_{1,2}B_{2:} + \cdots + kA_{1,n}B_{n:} = k\left(A_{1,1}B_{1:} + A_{1,2}B_{2:} + \cdots + A_{1,n}B_{n:}\right) = k(AB)_{1:}$$


Section 2.2 


3g: 


4d: 


4f: 
4j: 
4: 


Answers will vary. Any answer where x3 = 4x4 and 
xj = —2x) (but not all zero) is correct. For exam- 
ple, x; = —3, x2. = 5, x3 = 4, x4 = 3 is one solution. 


x=-6,y=-4% 


4 1 8 
XM] = 73, %2 = 735, 4X3 = 7 


Xx, = —92, x2 = —42, x3 = 32, x4 = 77 


Section 2.3 


le: 


1d 


th: 
2d: 
2f: 


3e: 


(i) yes (ii) consistent (iii) 1 
(i) no 


(i) yes (ii) inconsistent (iii) 0 


(i) yes (ii) consistent (iii) infinitely many 
(i) not in reduced row echelon form 
(i) in reduced row echelon form 
al 4 9 
ia v2 _ 3 -l 
(ii) ad +r 1 
V4 2 0) 
val 1 
. oe _ 1 
(i) yes (ii) | v2 =r} -3 
V3 1 


3d: 


4e: 


4f: 


5b: 


10: 


12: 


13: 


VI -1 6 
v2 |=r] 1 |+s] 0 
V3 0 1 
abl- 
r 1 S 2, 
32 ~3 
(a) yes, yes (b) infinitely many, infinitely many (c) 1, 


1 (d) no (e) yes 


The coefficient matrix will have more columns than 
rows, and therefore the system will have at least 
one free variable. 


Yes. For example 


3x + 2y = 7
x - 3y = -5
5x - 4y = -3


has solution x = 1, y = 2. 


Section 3.1 


le: 


li: 


Answers will vary—examples will be different as 
they are creations of individuals, and explana- 
tions will be different as they are requested as 
informal explanations, likely based on intuition 


rather than definition. Let r = 3, s = 2 and 

A = . = Then (rs)\A = (2:3)A = 
6.=1 652 -6 12 

ras | 6-4 6--5] ~ & 250." 
Os =I) 222 

r(sA) = 3(2A) = ( a a = 

alice 4 _ qe=0 354 5 

8 -10 > 3-8 3--10 - 
oO EE | eernleiet tially b 

24 ~30 F e rule 1s true essentia y ecause 


the associative rule for multiplication of real num- 
bers applies to each entry. 


Answers will vary—examples will be different 
as they are creations of individuals, and ex- 
planations will be different as they are re- 
quested as informal explanations, likely based 


on intuition rather than definition. Let A = 
3 -4 -9 19 1 
as ad eed Then 
7 
3 -4 -9 19 1 
T_ = 
ee) -(| 3 6 2 |+| 4 7 23 } 7 
r 4 11
: ol - | 5 1 | while AT + 
=" 0 
2 foe SP alte ET. 
—  ee. 4. 7 
a. 77 1 4 4 11 
-4 -6 |+]/ 9 7 |=] 5 1 4. Therule 
-9 2 1 -2 -8 O 


is true because corresponding entries of A and B are 
also corresponding entries of A? and B’. Thus the 
two numbers being added to obtain each entry of 
the sum is the same on either side of the equation. 


5: (a) A = . 2 | (b) Answers will vary. 
(AT)! = 7 es | because the inverse of the 


16b: 


17: 


transpose is the transpose of the inverse (theorem 4 
claim 8) 


Answers will vary. One demonstration relies on
the fact that for any matrix M, M − M = 0 (justified
in an answer on page 77): A + (−A) =
A + (−1·A) = A − A = 0. Another demonstration
relies on distributivity (theorem 3 claim 2):
A + (−A) = 1·A + (−1·A) = (1 + (−1))A = 0A = 0.


Answers will vary. $(B^{-1}A^{-1})(AB) = ((B^{-1}A^{-1})A)B =
(B^{-1}(A^{-1}A))B = (B^{-1}I)B = B^{-1}B = I$ using the
associative property multiple times, the definition
of inverse multiple times, and the definition of the
multiplicative identity once. The second half of the
justification, that $(AB)(B^{-1}A^{-1}) = I$, can be made
by a similar string of equalities. You are encouraged
to try it.
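A concrete instance can make the string of equalities easier to trust. The following sketch is plain Python with exact fractions, and the two matrices are arbitrary invertible examples chosen for this illustration. It checks both products against the identity.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [5, 3]]   # arbitrary invertible example
B = [[1, 2], [3, 4]]   # arbitrary invertible example
identity = [[1, 0], [0, 1]]

candidate = matmul(inverse_2x2(B), inverse_2x2(A))   # B^-1 A^-1
left = matmul(candidate, matmul(A, B))               # (B^-1 A^-1)(AB)
right = matmul(matmul(A, B), candidate)              # (AB)(B^-1 A^-1)

print(left == identity, right == identity)
```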


Section 3.2 


1b: 


3b: 


4e: 


5d: 


6d: 


|) 2 
“| 62 


i. | 
“36 | 


—136 


$Z = Y^{-1}X^{-1}B$ assuming X and Y are invertible


: C = 3D"(A')" — (B')? assuming A and B are in- 


vertible 


10r 
—6r 


14s 
38 


-17 


7d: The product has no third row 


-14 
sa [3 
16 
va | 'S 
Section 3.3 
8: linearly independent 
1 0 0 
: 0 1 0 
12: (a) must have 3 pivots: 001 (b) must have 
0 0 0 
10x 1 * 0 
: ; O 1 x 0 0 1 
scare [E03 | 00 0>F 
0 0 0 0 0 0 
1 0 0 0 0 1 0 1 0 
0 0 0 0 0 0 0 0 0 
0 0 O07 0 0 O07 0 0 O07 
0 0 0 0 0 0 0 0 0 
0 0 0 
0 0 0 
0 0 0 
0 0 0 
15b: x ≠ −8
15d: x # % 
16b x= 
16d: x =-8 
-9 7 
; - 0 0 
22b: (i) 7R:1 + 9R:2 = O (il) 7 0 +9 o | = 
0 0 
0 180 -—140 1260 
0}... —108 84 —756 
0 | 7] 92 179) 56 |=] —so4 |* 
0 —189 147 —1323 
—1260 0 
756 _| 0 
504 0 
1323 0 
23b: (i) columns 1 and 3 are linearly independent; and
columns 2 and 4 are linearly independent. (ii)
columns 1 and 3 are not multiples of one another
and therefore are linearly independent; and
columns 2 and 4 are not multiples of one another
and therefore are linearly independent. (iii)
columns 1 and 3 are not multiples of one another
and therefore are linearly independent; and
columns 2 and 4 are not multiples of one another
and therefore are linearly independent.


Section 3.4 


1d: (i) no (ii) yes (iii) one (iv) one 


2c: (i) 9 (ii) yes (iii) no (iv) infinitely many (v) infinitely 
many 


3e: (ii) no (iii) no (iv) infinitely many (v) infinitely many 
7: 5 


10: (a) must have three pivots: 


100 x 10 0 
E 1 0% |] 0 1 * | 
0 0 1 x 0 0 0 1 
1 x 0 0 0 1 0 0 
E 0 1 rife 0 1 7 
0 0 0 1 00 0 1 

(b) must have 2,1, or 0 pivots: 
10% * 1 *« O x 
E 1 * *« |] 0 0 1 x |, 
0 0 0 0 0 0 0 0 
lx *« 0 0 10 * 
E 0 0 | 9 0 1 x i, 
0 0 0 0 0 0 0 0 
0 1x 0 0 0 1 0 
f° 0 0 it} a 0 0 14, 
0 0 0 0 0 0 0 0 
lk k * O lx x 
lo 0 0 OF; 0 0 0 O |, 
0 0 0 0 0 0 0 0 
001 x 0 00 1 
f° 0 0 04,; 0 0 O | 
0 0 0 0 0 0 0 0 

0 0 0 0 
f° 0 0 | 
0 0 0 0 


12b: 82 

14b: x ≠ −8, same as section 3.3 exercise 15b
14d: x# 28. same as section 3.3 exercise 15d 
15b: x= 2, same as section 3.3 exercise 16b 


15d: x = —8; same as section 3.3 exercise 16d 


Section 3.5 
5: (a) —38 (b) 19 (c) 38 (d) -190 
8: 2 


Section 3.6 


11: Theorem 7 part (ix) is false for G so part (v) is also
false, from which we conclude $G\mathbf{v} = \mathbf{0}$ has infinitely
many solutions (in addition to the trivial solution).


13: Theorem 7 part (xiii) is true for M so part (i) is as
well, from which we conclude that the columns of
M are linearly independent. Since the rows of $M^T$
are the same as the columns of M, the rows of $M^T$
are linearly independent.


15: Theorem 7 part (x) is false for B so (a) its columns
are linearly dependent [by theorem 7 part (i)] (b)
$B\mathbf{v} = \mathbf{0}$ has infinitely many solutions [by theorem 7
part (v)] (c) there is no matrix A such that AB = I
[by theorem 7 part (xi)]


17: (a) false (b) false (c) false (d) false (e) true (f) false 
(g) true 


Section 3.7 


8: (a) No. A need not be square (it may be 2 × 3
for example) and therefore not invertible.


(b) 


Yes. $\det(AA^T) = (\det A)(\det A^T) = (\det A)^2$, and the given
matrix $AA^T$ has determinant 5, so $\det A = \pm\sqrt{5} \neq 0$.


10: No. By theorem 7 M − cI must be invertible and
therefore det(M − cI) ≠ 0.


He: A - al —e 3 ok reduces to 
ae ~ so an eigenvector v must satisfy 
Vy2= — HIN y, In parametric vector form, 


Vj _ val 
V2 _3+iV3,,, 


1 
=Vi) —34iv3 [> 
q 


4 


Using generic free variables (and multiplying by 
—A4): 


ver 


an 


-12 9 18 5d: All constant functions (those whose graphs are hori- 
lle: A - al = 12 -9 -18 reduces to zontal lines) 
12 -9 -18 
-4 3 6 12c: No 
0 0 0 | so an eigenvector v must satisfy 46. Yes 
0 0 0 
vy = 30> + 33. In parametric vector form, 14a: Yes 
14f: Y 
v1 du + 303 3 i eS 
v2 |= V2 =vo} 1 |+v3] O}. 15¢e: Yes 
V3 V3 0 1 
21: Yes 
Using generic free variables (and scaling to elimi- 
nate fractions): 16c: No 
3 3 17a: S U{v} = a ; = : aa ee 
2 3 4 2 
v=r| 4 ]/4+s] 0 
0 2 a aa = bel © 
3 4 0 
45-51 —24 —60 17d: SU {v} = {1,1,°,5P — 91+ 5} and -5(1) + 9(¢) - 
15 17 18 0 2 2 
a = 5(t-) + 1(5t -— 9t +5) =0 
llg: A- Al 15 17 8 20 reduces to (1°) + 1¢ ) 
-30 -34 -16 -40 21: line 1: u2+v = v for any vector v (including u,); line 
15 17 0 36 2: commutativity of addition (vector space defini- 
0 oOo 1 -2 . tion property 2); line 3: u; + v = v for any vector v 
0 00 0 so an elgenvector v must sat- (including uy) 
0 0 0 O 
isfy vy = -ty - 84, v3 = 2v4. In parametric ° 
sie Section 4.2 
v4 -y - 6y, 1b: 5 
a v2 If: 6 
V3 2v4 
V4 V4 2ec: it is not linearly independent 
17 36 
- aa 3e: it is not a subset of P3(R). 
=v2} g |tM4l 9 4b: (i) P>(R) (ii) one answer is 
0 1 5 > 
{9-41 +77, -20+ 8+ 177} 
Using generic free variables (and scaling to elimi- : . 
nate fractions): (but answers will vary since any two vectors from 
the set will do) 
5 a Sb: {-20 + 82+ 17,-11 4 42+ 24¢ \ (but answers will 
v=r +5 vary since any two vectors from the set will do, and 
0 30 . ; . ; 
0 15 it has to be different from exercise 4b); yes, this 
subset is a basis for the span 
Section 4.1 oe 2 
0 -] of: 2 
4d: The line passing through and | 
0 3 8: {b), bz, b3, bs} must be linearly independent. If it were 


4i: The xy-plane 


5a: The set of all sequences whose terms are all equal 
except possibly the third, which is twice the others 


linearly dependent, we would be able to eliminate 
one or more of the vectors without affecting the 
span, resulting in a linearly independent spanning 
set (basis) with fewer than four elements. 
9: $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\}$ must be spanning. If it were not, there
would be a vector, call it $\mathbf{b}_5$, that could not be written
as a linear combination of $\mathbf{b}_1$ through $\mathbf{b}_4$, resulting
in a linearly independent set, $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4, \mathbf{b}_5\}$,
in $\mathbb{R}^4$, contradicting theorem 9.


12d: yes—the determinant of the given matrix is nonzero 
12f: no—there are more than four columns 


13d: yes—the matrix can be reduced to the (6 x 6) identity 
matrix 


Section 4.3 


1b: yes 


: Answers will vary. Any counterexample to (violation
of) the two properties of a linear transformation
will do. For example, f(1 + 1) = ln(1 + 1) = ln 2
but f(1) + f(1) = ln 1 + ln 1 = 0. f does not preserve
addition.


14d: We must demonstrate that the two properties of linear
transformations hold.


1. 


- xy 47 Xp 
“LalE HS]? | 
—5 Z1 —5 22 
v1 X2 
le Iele 
Z1 £2 


15b: We must demonstrate that the two properties of lin- 
ear transformations hold. 


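$$\begin{aligned} L\left((a_1x^2 + b_1x + c_1) + (a_2x^2 + b_2x + c_2)\right) &= L\left((a_1 + a_2)x^2 + (b_1 + b_2)x + (c_1 + c_2)\right) \\ &= 6(a_1 + a_2)x + 3(b_1 + b_2) \\ &= 6a_1x + 6a_2x + 3b_1 + 3b_2 \\ &= (6a_1x + 3b_1) + (6a_2x + 3b_2) \\ &= L(a_1x^2 + b_1x + c_1) + L(a_2x^2 + b_2x + c_2) \end{aligned}$$

$$\begin{aligned} L\left(k(ax^2 + bx + c)\right) &= L(kax^2 + kbx + kc) \\ &= 6kax + 3kb \\ &= k(6ax + 3b) \\ &= kL(ax^2 + bx + c) \end{aligned}$$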
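The two chains of equalities can be spot-checked on specific polynomials. In the sketch below (plain Python), a quadratic ax² + bx + c is represented by its coefficient triple (a, b, c), and the output 6ax + 3b is represented by the pair (6a, 3b). This representation is a convenience chosen for the example. A spot check is not a proof, but it catches algebra slips.

```python
def L(p):
    """Map (a, b, c), standing for ax^2 + bx + c, to (6a, 3b), standing for 6ax + 3b."""
    a, b, c = p
    return (6 * a, 3 * b)

def add(p, q):
    """Add two polynomials given as coefficient tuples of equal length."""
    return tuple(pi + qi for pi, qi in zip(p, q))

def scale(k, p):
    """Multiply a polynomial, given as a coefficient tuple, by the scalar k."""
    return tuple(k * pi for pi in p)

p = (2, -1, 7)    # 2x^2 - x + 7
q = (-3, 4, 5)    # -3x^2 + 4x + 5
k = 5

preserves_addition = L(add(p, q)) == add(L(p), L(q))
preserves_scaling = L(scale(k, p)) == scale(k, L(p))
print(preserves_addition, preserves_scaling)
```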


nerf 3-0 HAD 
rl)-[ | 
we rll & =o] 2-2 ]- 
1
19¢: | —4 } 
7 
5 
| 
=2 
4 
=f 4] 
20: u-| a i 
i 7 
Section 4.4 
0 


la: 


le: 


3d: 


3f: 


3g: 


4d: 


5d: 


6b: 


7: 


re: | 


7 


ba 


a6 
—24 282 


-172 23 33 


-1 


2 


-2 


2 
52 
62 


—135 


151 
15 


| 


-l O 
13¢: | e: | 
0 0 
14b: Wf 2 
2 2 
Section 4.5 
10e: (i) Yes. (ii) Yes. (iii) Yes. 
12b: h(| ry Tn |) = (r1,12,--+5Tn) 
12d: p(ril.y + rola ++++ + Palen) = (11, 12,---51n) 
13a: By definition, $T_M : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one precisely
when
$M\mathbf{v} = \mathbf{b}$
has at most one solution for each b in $\mathbb{R}^n$ (they are
equivalent). Since the latter is one of the statements
of the theorem, "$T_M : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one" may
be added as yet another equivalent statement.
13c: By definition, $T_M : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one precisely
when
$M\mathbf{v} = \mathbf{b}$
has at most one solution for each b in $\mathbb{R}^n$ (they are
equivalent). By definition, $T_M : \mathbb{R}^n \to \mathbb{R}^n$ is onto
precisely when
$M\mathbf{v} = \mathbf{b}$
has at least one solution for each b in $\mathbb{R}^n$ (they are
equivalent). Since the latter halves of each of these
statements appear in the theorem, the properties
"$T_M : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one" and "$T_M : \mathbb{R}^n \to
\mathbb{R}^n$ is onto" may be added as additional equivalent
statements.


13e: Let $f \subseteq V \times W$ be an isomorphism. By definition,
f is one-to-one and onto. Therefore the equation
f(a) = b has at most one solution for each b (by
definition of one-to-one) and has at least one solution
for each b (by definition of onto). Since the
number of solutions is both less than or equal to
one and greater than or equal to one, it must be exactly
one (for each b). Hence each element b of W
has exactly one preimage a in V. In other words,
the relation $f^{-1} \subseteq W \times V$ contains exactly one element
(b, a) for each b in W. Therefore the domain
of $f^{-1}$ is W and $f^{-1}$ is a one-to-one function. That
$f^{-1}$ is onto follows from the fact that f is a function
(there is a pair (a, b) in f for every element a of the
domain V so there is a pair (b, a) in $f^{-1}$ for every
element a of V). Since $f^{-1}$ is a one-to-one and onto
function, it is an isomorphism.
Section 4.6


2c: 135 
2f: 113 
3a: 


3d: 


In(10 — 7) 


4b: = (x+20 m0 


J=1+ 2 toga0-m 
T 


4d: 


5a: —50 
5d: 


6d: 


7d: 


8a: 


Nw NR 


9d: 


— 
NS) 


14a: All scalar multiples of | < | 


14c: All functions whose integral over [0,1] is zero. In 
other words any function whose graph has an area 
above the x-axis equal to the area below the x-axis 
over the interval [0, 1]. 


11 


16b: 


16d: 2 


18: HINT: It fails to satisfy inner product property 2. 
Can you show this? 


Section 5.1 
le: yes 


1f: no 


2e: Answers may _ vary. One solution is 
2 —-6 
4 |?) -14 |f- 


2f: Answers may vary. One solution is {| 


“10 |} 


3f: Answers may vary. One solution is | = }\. 


3e: {} (the empty set, whose span is {0}) 


4f: (i) yes (ii) Answers may vary. One solution is 


-12 6 11 
—36 15 33 si 
a4 P| -9 [el (iii) Answers may 
60 —30 —33 
3 
vary. One solution is {| ms 
0 
4h: (i) yes (ii) Answers may vary. One solution 
-12 -8 8 —48 
—36 —20 16 —132 
is 24 |,) 24 |,} 24 |,] 156 (iii) {} 
—24 -8 -16 —84 
36 32 —72 132 


(the empty set, whose span is {0}) 


4j: (i) yes (ii) Any five linearly independent vectors in 
R° will do. For example, the columns of M or the 
standard basis. (iii) {} (the empty set, whose span 


is {0}) 
0 0 
_1 a 
6d: (i) b= -iM.2 (ii) v = oO +r] 
0 0 


13: Columns 1, 3, and 5 are linearly independent. Yes, there are other such sets.


15d: 1 


Section 5.2 


5b: v= 


1 
6d: v=} 1 
4 |e 


Tf: v = [12g 


2 -3 
9: (a)V= (b) v= 
=4 8B -2 8B 
1 —2 
11: wr-| 2 | wr| 1 | 
ol 8B 3 B 
-1 
(c)v=] 2 
1 Je 
: _|6 -5 |) TF nl) 2 3e: S is not orthogonal
[Bz es | = ; The coordinates are the 3g: S is not orthogonal 
8B 
same as those in question 9. 
3i: S is not orthogonal (Orthogonal sets of nonzero vec- 
=a OD =o tors are linearly independent, but this set is neces- 
17: [Ble=| 7 4 5 sarily linearly dependent having four vectors from 
3 3 4 R?.) 
17 1 
(a) [Blg'} 10 |=] 2 5d: 4 — 81 + 3 is not orthogonal to f° — 21. 
°) —1 
8 —2 _ | 0 
(b) [lg'} 5 }=] 1 sual le 
9 3 ‘le 
19 -1 -4 
(c) [Bz 6 |=] 2 ; The coordinates are be: Es 
7 oe 
the same as those in question 11. 7a: 
4.3 , 
21: [Blo = | ty i | 
1 
-! _~b5 7 
2 136 68 
23: [Blo = 5 -— —2 | as produced by the "pHOI,V 
; Le 8 y 
7 7 
code at 139. 
3 7 
27: (a)v= i | wy=| é 
7 |g 15 Je 
uf 3 3 t 
(c) iw |] B] =| & | wrien is oe a 
1 5 I Te: 
expected 
29: As produced by the code at ) SageMath Cell AAPOR 
5 4B 
(a) V = | -% (b) Vv = 136 (c) 
au as 
4 Ig 17 Jc 
-i _15 7 55 _~B 
oe a eee 
5 ~ 136 as “p = 6 which is ; ae 
0 -—F -TIl ® oa ZR 
[vlc as expected 
Section 5.3 Te: 
1e: yes
1 3 3 4 
If: no. V3 0353S V5 | 


2e: S is orthogonal 


2i: S is not orthogonal (Orthogonal sets of nonzero vec- 
tors are linearly independent, but this set is neces- 
sarily linearly dependent having four vectors from 
R’,) 


pro]gv 
8a: projyy =| 3 1 l' . 


8c: proj, Vv = [ —2 


|. Multiples of these vectors also 


12e: 


12g: 


12i: 


13¢e: 


14a: 
14e: 
14e: 
14g: 


14i: 


15a: 


15¢: 


15e: 


suffice, for example. Actually, any orthogonal set 
of two vectors, such as the standard basis, will do 
since the span of the given set is R?. 


Answers may vary. One answer is 


-1 3 
| 2 r 0 |} Multiples of these vectors also 
1 3 


suffice, for example. 


Answers may vary. One answer is 


2 1 9 
1 },] 2 |,) —6 |>. Multiples of these vec- 
4 -1 -3 


tors also suffice, for example. Actually, any or- 
thogonal set of two vectors, such as the standard 
basis, will do since the span of the given set is R*. 


Answers may vary. One answer is 
—2 35 685 29174 
—4 109 -377 -7042 
7 |} 53 |*] -369 |") 13581 of: 
3 45 815 —21629 


shown at 141. Multiples of these 
vectors also suffice, for example. Actually, any 


orthogonal set of two vectors, such as the standard 
basis, will do since the span of the given set is R*. 


As shown at & Sagelath Cell py 


——— 
—=——S 


—a—_ oo oases ee 
| eee, (—aeeny 
| 
al-aly|- 
. - 
$l- o8l- 1 
nN 
eS S$]¢ Sl-sl- 
—— 
———’/ 


7 ae 3 
va Jl v6 ~via 
—2_ eae nes 2 -69 192
8 2 V4485 V 
ae 109 ee ir 17g: Orthogonalized: 3 |,| —401 |,] -30 |}; 
15i: vO 1 20 1) ee 4) | 2a“, | a7 
a as yee? Orthonormalized: 
V28 2 V4485 6 v39215 2° ___ 69 192 
58 v70 V461195 39531 
Pata 3 | | —_401 _ 30 
_ 14 V70 V461195 | 39531 
ba ee VI0 “V461195 ~ V39531 
ea 17i: Orthogonalized: 
3 Vo82 
-7 1755 216 0 
16e: S is orthogonal as-is. Normalized: 0 1168 195 0 |{h- 
6 ae 9 455 56 0 
ea = aye Orthonormalized: 
y46 2 yi22 a 1755 216 
Wie Vin VB 2V 1607387 132114 
3 "| -vieg73s7 |’ Vig2ii4 
16g: S is orthogonal as-is. Normalized: VB 2 Vi607387 Vis2ii4 
2 1 64 
Ve “Vis Vays 20: {1,¢-1, P — 20+ 5} 
Bf) ys || Es 
vs ‘Vi65 YATES Section 5.4 
[Technically, any orthogonal set of at least 3 vec- 1 
tors will suffice since any such set spans R*. We 1c: Answers may vary. P = I | P''MP = 
have not been asked to use the orthogonalization i 0 
procedure.] 
0 -l 
16i: Following the orthogonalization process leads to a 
-7 9 288 0 le: Answers may vary. P=| -1 1 —-1 |; PIMP = 
O |.) 8 }|,}| -520 ],] 0 2 3 #1 
9 7 224 0 -8 0 0 
: 0 -20 O 
[Technically, any orthogonal set of at least 3 vec- 0 0 -16 
tors will suffice since any such set spans R*. We 
have not been asked to use the orthogonaliza- 0 1 a) 
tion procedure.] The zero vector cannot be nor- 3 3 1 -3 
malized so must not be included in the orthonormal 1? Answers may vary. P = | _, _, _» o - 
set. Normalizing the other three vectors: 1 2 -1 1 
a A a Bes p-'MP = 0 21 0 O 
5 ; vi94 , Y¥g305 0 0 -7 O 
‘VI30 “Vi94 6305 0 0 O -7 
[Technically, any orthonormal set of 3 vectors, 2c: —1,1 
such as the standard basis, will suffice since any a 
such set spans R?. We have not been asked to “© ~“)~*9 
use the orthogonalization procedure. ] 2i: -7,7,21 
6 “19 3a: no 
17e: Orthogonalized: -1 |,J -21 |}; 
3 8 3h: no 
6 19 
V6 = 135 4d: k=0 
Orthonormalized: =e 5 = 
we aia 5d: Answers will vary. P = ae | 
VO65 V1435 1 5 
abe 1g: Answers may vary. One solution is L =
10c: Answers will vary. P=| d e f | where 5 0 0 5 2 
g hi -4 -21 0};U=]0 1 
1 -6 1 0 0 
25698 319807 50820 
~~ 97469"! bs 97469 2 9746973 li: Answers may vary. One solution is L = 
1 O 0 -6 8 4 
_ __ 16879 et 1996453 _ 317321 | -l 1 Of};;U=] 0 12 O 
~ 389876 | 389876 * 389876 ° , -2 i 0 0 -12 
oe ae 01 
- 326995 7 319057801 : 50321237 2a: Answers may vary. One solution is P = 1 0 | 
~ 389876 | 389876 > ~—« 389876 b=| 1 0 | v= 1 7 
, _ 13066 12604743 1988259 =e Ail, . 2 
~ 97469 | 97469 >| 97469 ° 2e: Answers may vary. One solution is P = 
1244091 1215417989 : : : : : : : , 
= r| 12 : = : = 
389876 389876 001 D2 1 
1 
_ Zee lat 12 0 
389876 0 6 -4 
§ =73 0 O -7 
h=r 3a: detM =3-1-1-1=3 
4l 1106915 1659967 3e: det WM =1-1--2-40 = -40 
= -——— 1 + ——— fr), - ——— 73 2 
389876 389876 389876 . : 
g: M is not square so has no determinant 
evs 3i: det M =1-1-4.-6-12--12 = 432 
mk. 4a: det M =-1-1-1-1-25=-25 
12b: Reflection across the line y = 5X parallel to the y- de: detM=-1-1-1-1--1-6-—7 = —42 
axis. 1 0 35 
cos @ aS a ae Oe oe 0 1 
12c: Scaling by a factor of 2 in the direction of ae 
: 1 0 —2 -6 
and by a factor of 3 in the direction of | an gle =2 | ue | 0 20 | 
cos @ 2 
‘ 1 0 0 -25 10 
1 1 2 0 1 -1 -pu| 4 = 
- ye 5 1 5g: L= z= 1 O};UH= QO -2!1 
rete Parise : | “12 4 0 0 
6)! 4)! 6\' , (4\7 
=| (3) +7 3) | A 1 0 0 6 8 4 
8) _4(6 4 6 4 Si: L=| -1 1 Of,;U=] 0 12 O 
7(3) +7(3) 7(§) +(3) ' 2 35 . 
5 =a 1 0 O -6 
Section 6.1 epee: Sloe se 
6a: L= 9 1 f U= 0 1 
la: Answers may vary. One solution is L = o | 2 0 1 3 
“3 1) 6e: L= = 
3 5 5 20 0 1 
ufo 4 
é. 4 0 1 6 6 
1 0 “| 1 -36 0 1 ; 
le: A .O lution is L = : 
c: Answers may vary. One solution is -3 1 | BB a \ 2 
U= -2 -6 6g: L=| -20 -21 0];U=]0 1 
“1 0 40 5 -6 1 0 0 
-6 0 0 1 -$ -3
6i: L=} 6 12 0 f,;U=]0 1 0 
-9 -18 -6 0 0 1 
2 
7b: v= 13 | 
2 
7f: v=| 1 
—2 
1 1 
9b: M7! -| “2p 3H | 
tH 
-4 _2 1 
“1 ie ae 
9d: M7 = 7 0 4 
<3 af) 2 
—33 _15 | 
mar] 4 | 
29 26 of 
2 2 
Section 6.2 
1d: 7 
1f: -6
1h: -24
2d: no 
2f: no 
3c: Answers may vary. One answer is 3.996, ay | 
3e: Answers may vary. One answer is 83.06, $\begin{bmatrix} 10654688 \\ 16132160 \\ -10084800 \end{bmatrix}$


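4c: $v_0$ through $v_3$ are
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 17 \\ -8 \end{bmatrix}, \begin{bmatrix} 49 \\ 0 \end{bmatrix}, \begin{bmatrix} 833 \\ -392 \end{bmatrix}$,
giving two rather different directions, $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 2.125 \\ -1 \end{bmatrix}$, so it seems the method will not converge.


4e: $v_0$ through $v_3$ are
$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -11 \\ -14 \\ -13 \end{bmatrix}, \begin{bmatrix} 81 \\ 81 \\ 81 \end{bmatrix}, \begin{bmatrix} -891 \\ -1134 \\ -1053 \end{bmatrix}$,
giving two rather different directions, $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 11 \\ 14 \\ 13 \end{bmatrix}$, so it seems the method will not converge.


5c: 


5e: 


6b: 


7b: 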


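$v_{10} = \begin{bmatrix} -53291821 \\ 63954728 \\ -13570337 \end{bmatrix}$ and $v_{11} = \begin{bmatrix} 3533661079 \\ -4085829332 \\ 896471030 \end{bmatrix}$, both pointing in approximately the direction $\begin{bmatrix} 1 \\ -1.2 \\ 0.25 \end{bmatrix}$, so it seems the method will converge.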
0.25 


= 
— 
oOo 
SS 
I WIND) 
— 
a 


(i) -12 (ii) 2 (iii) yes, [ -39 8 -.98 1 ]^T
(iv) Answers may vary. With $v_0$ = [ 8 -3 2 -7 ]^T, yes, it produces the same
eigenvector as before. 


yes, it works 


8: (a) 390 is the dominant eigenvalue of M (b) the eigenvector corresponding with eigenvalue 390 is [ 1 11éi1 ]^T (c) the power method will work


10a: eigenvalues 33 and 11. It seemed the method would converge, and this is because M has a dominant eigenvalue.


10c: eigenvalues 7 and -7. It seemed the method would not converge, and this is because M does not have a dominant eigenvalue.


10e: eigenvalues 9 and -9. It seemed the method would not converge, and this is because M does not have a dominant eigenvalue.


10g: eigenvalues -39, -52, and -65. It seemed the method would converge, and this is because M has a dominant eigenvalue.
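The contrast between these answers can be reproduced on small matrices. The following is a rough sketch in plain Python rather than SageMath; the matrices A and B are made up for illustration (they are not the matrices from the exercises), and the helper names are hypothetical. A has eigenvalues 3 and 1, so it has a dominant eigenvalue; B has eigenvalues 7 and -7, so it does not.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def scaled(v):
    """Divide by the entry of largest magnitude so directions are easy to compare."""
    big = max(v, key=abs)
    return [x / big for x in v]

def power_iterates(M, v0, steps):
    """Return the scaled iterates v0, M v0, M^2 v0, ... of the power method."""
    v = v0
    out = [scaled(v)]
    for _ in range(steps):
        v = mat_vec(M, v)
        out.append(scaled(v))
    return out

A = [[2, 1], [1, 2]]    # hypothetical; eigenvalues 3 and 1
B = [[1, 6], [8, -1]]   # hypothetical; eigenvalues 7 and -7

a_iter = power_iterates(A, [1, 0], 20)
b_iter = power_iterates(B, [1, 0], 3)

print(a_iter[-1])   # settles near the direction of [1, 1]
print(b_iter)       # alternates between two directions
```

For B the iterates simply alternate between two directions, since B squared is 49 times the identity. This is the same pattern seen in the answers where the method does not appear to converge.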
Section 6.3
4: 2 
5: 1 and 2
9b: 
1.5 
-0.5 0.5 1 1.5 
ae 3 

11c: Answers may vary. One solution is P = 0 9


11f: Answers may vary.


2 0 0 
4 6 0O 
-1 -5 1 

Section 6.4 

1d: By 

If: sy 

._4 

th: -wv 

2d: 11 

2f: af 1082 = 9.123 


2h: 


3a: 


3e: 


3g: 


4d: 


1745 
245 w 2.946 


— 106 
29 
0 
265 


£| 


One solution is P 


| 


6 
10 
4f: 0 
1 
5d: no 
6d: 
7 7 1375 
| | | e | | | 
7 |= = |+| = 
97 7 
47 2375 
12 T94 94 
_ AT 1375 
an ae 32 |. «. 
where al is in W and 97 «| isin wt 
_ AT 2375 
194 194 


7c: The system is consistent: 


x —661 
y |= 5 —337 
z 130 


7e: The system is consistent: 


er be? 


7g: The system is inconsistent. The best approxima- 


10 
tion of b = | 4 | asa linear combination of the 
9 
7045 
columns of the coefficient matrix is sbi . One 
tosh 
. . rs . 1358 . . 
particular linear combination (best approximation 
to a solution) is x = — 2, y= -3. z= 0. An- 


swers may vary. There are infinitely many others. 


“le 


10: See the solution of exercise 3e for a hint. 


12: See the solution of exercise 8b for a hint. 


15a: — 182 (-6x" +2x-1) 


15e: — 31 (11x? + 3x- 1) 


+ 303 (—4x3 + 8x? + 17x - 28 


_ 2012.3 , 193305 2 , 1029134. 1814655 
979 © + Tagz49% + 


128249 * — 728249 


15e: - 133 (Sx? - 11x + 14) + Gt (4x? - 12x - 5) 
_ _ 153472 _ 30915. 370406 
~ 19723 19723 19723 


16b: B (-3x? + 3x- 1) 
— 859. (-14x3 + 9x2 — 3x4 25) 


3382 
— 6013 3 _ 281802 ,2 4. 217377 ,. _ 330159 
~ 1691 42275 42275 42275 
17a: no
© w= prop = 222 (7x3 2_8y~ 
18a: Ww = projyP = 3 (7x + 8x" — 8x 1) 
— 36403 , 41602 _ 4160, 520 
857 857 


a7 * 645 3°. 5017°) , 6731 
Be aes el _ 
wo = PW = 357 857 857 


2234 
857 


xX+ 


19b: $\{13 - 6x + 9x^2 - 6x^3,\; -3 + x - 3x^2 + 2x^3\}$


Section 7.1 
1b: nonlinear 
1d: linear 

1f: linear 

1h: linear


2a: The general shape of the graph is parabolic, so one might try a model of the form $f(x) = \beta_0 + \beta_1 x + \beta_2 x^2$.


2c: The general shape of the graph is logarithmic, so one might try a model of the form $f(x) = \beta_0 + \beta_1 \ln x$.


2e: The general shape of the graph is exponential, so one might try a model of the form $f(x) = \beta_0 + \beta_1 (\sqrt{2})^x$. See question 6 for further discussion.


3b: g(x) = 2.90418 + 6.06618x 


50.3321176x 


3f: &(x) = jae 


4b: $g(x) \approx 2.90418 + 6.06618x$


50.3321176x 
l+e! 


4f: €(x) = 
5b: $\|Mv - b\|^2 \approx 114.654$
5f: $\|Mv - b\|^2 \approx 13786.1$
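The quantity reported in 5b and 5f is the sum of squared errors of the fitted model. As a minimal sketch of how such a number arises, assuming a straight-line model and using made-up data (the data of the exercises is not reproduced here), the following plain Python fits a line through the 2x2 normal equations and then evaluates the squared norm of Mv - b, where row i of M is [1, x_i]. The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares line b0 + b1*x, solved from the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b0 = (sy - b1 * sx) / n
    return b0, b1

def sum_squared_errors(xs, ys, b0, b1):
    """||Mv - b||^2 where row i of M is [1, x_i], v = [b0, b1], and b = ys."""
    return sum((b0 + b1 * x - y) ** 2 for x, y in zip(xs, ys))

xs = [0, 1, 2, 3]   # hypothetical data, not taken from the exercises
ys = [1, 3, 4, 8]

b0, b1 = fit_line(xs, ys)
sse = sum_squared_errors(xs, ys, b0, b1)
print(b0, b1, sse)
```

Any other choice of coefficients gives a larger sum of squared errors on the same data, which is what makes the least squares solution the best approximation.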


6: (a) 


t | .6203 
3.524 
t | 3.147 
2.293 


1.062 
3.081 
8.259 
1.036 


1.625 
2.911 
8.931 
1.106 


2.158 
2.684 
9.519 
.7467 


(b) $f(t) = 3.314 - 0.2662t$ (c) $a = 27.49$
(d) $y(t) = 27.49e^{-0.2662t}$
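Parts (b) through (d) fit together if the line in (b) is read as a fit to the natural logarithm of the data, so that exponentiating the intercept gives the coefficient a. That reading is an assumption made here, but the numbers are consistent with it, as this short Python check suggests (the small difference in a comes from using the rounded intercept).

```python
import math

# Coefficients of the fitted line f(t) = 3.314 - 0.2662 t from part (b),
# assumed here to model ln(y).
intercept = 3.314
slope = -0.2662

a = math.exp(intercept)   # part (c): a = e^intercept

def y(t):
    """Exponential model y(t) = a * e^(slope * t), as in part (d)."""
    return a * math.exp(slope * t)

print(a)
print(y(0), y(5))
```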


Section 7.2 


1d: no


1f: yes


2p: QEESEMS 143 
0.803971 0.175122 
$ 2° = 
ae = eee 0.824878 
0.208450997 0.707128254 
0.791549003 0.292871746 
0.196029, 0.791549003 (iii) 40.791549003 + 
10.292871746 = 0.5422103745 (iv) vi = 
0.328171 | |_| 0.585760397 
0.671829 | °? ~— | 0.414239603 
ee | (v) 0.618507994821 


| ana M? = 


| Gi) =: 0.947, 


» V3 


0.618507994821 


2e: QEEIETSD 144 
| 0.2015 0.281044 ‘sea | 


(i) M? = 0.52332 0.484799 0.54607 


0.275172 0.234157 0.185366 
0.529767 0.498964 0.508049 | (ii) 
0.209172 0.248326 0.227327 


0.21, 0.52332, 0.52976784 (iii) 10.5297678 + 
10.49896412 + 40.5080499 = 0.512260651 (iv) 


ass 0.252708 0.264622 
M? = 


0.355 0.281044 
v= | 0.637 Vv. = jose v3 = 
0.008 0.234157 
0.25270888 
0.498964 123 | (v) 0.498964123 
0.248326997 


3: Because they have the same characteristic equation. Can you prove it? See crumpet 32.


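4b: span $\left\{ \begin{bmatrix} 846 \\ 947 \end{bmatrix} \right\}$


4e: span $\left\{ \begin{bmatrix} 257969 \\ 509670 \\ 233637 \end{bmatrix} \right\}$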
5b: SageMath Cell 145

$M^{32} = \begin{bmatrix} 0.472150 & 0.471552 \\ 0.527849 & 0.528447 \end{bmatrix}$; the columns of $M^{32}$ are approximately $\frac{1}{1793}$ times the answer from question 4b: $\begin{bmatrix} 846 \\ 947 \end{bmatrix}$. In other words, the ratio of the entries in each column of $M^{32}$, 0.47 : 0.53, is approximately equal to the ratio of the entries in the eigenvector, 846 : 947 (so the vectors are approximately multiples of one another). The columns of $M^{32}$ are (nearly) in the eigenspace of M.


5e: SageMath Cell 146

$M^{32} = \begin{bmatrix} 0.257640 & 0.257640 & 0.257640 \\ 0.509020 & 0.509020 & 0.509020 \\ 0.233339 & 0.233339 & 0.233339 \end{bmatrix}$; the columns of $M^{32}$ are approximately $\frac{1}{1001276}$ times the answer from question 4e: $\begin{bmatrix} 257969 \\ 509670 \\ 233637 \end{bmatrix}$. In other words, the ratio of the entries in each column of $M^{32}$, 0.257 : 0.509 : 0.233, is approximately equal to the ratio of the entries in the eigenvector, 257969 : 509670 : 233637 (so the vectors are approximately multiples of one another). The columns of $M^{32}$ are (nearly) in the eigenspace of M.
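The scaling described in 5b and 5e amounts to dividing an eigenvector for eigenvalue 1 by the sum of its entries, which yields a vector whose entries sum to 1. A small sketch in plain Python, using exact fractions, shows the computation on the two eigenvectors quoted above; the function name is made up for this illustration.

```python
from fractions import Fraction

def steady_state(eigenvector):
    """Scale an eigenvector for eigenvalue 1 so that its entries sum to 1."""
    total = sum(eigenvector)
    return [Fraction(x, total) for x in eigenvector]

p2 = steady_state([846, 947])
p3 = steady_state([257969, 509670, 233637])

print([float(x) for x in p2])
print([float(x) for x in p3])
```

The decimal values agree with the entries in the columns of the high matrix powers to the digits shown there, which is the sense in which those columns are (nearly) in the eigenspace.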


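6b: $\frac{1}{1793}\begin{bmatrix} 846 \\ 947 \end{bmatrix} \approx \begin{bmatrix} 0.471834913552705 \\ 0.528165086447295 \end{bmatrix}$

6e: $\frac{1}{1001276}\begin{bmatrix} 257969 \\ 509670 \\ 233637 \end{bmatrix} \approx \begin{bmatrix} 0.257640251039673 \\ 0.509020489854945 \\ 0.233339259105381 \end{bmatrix}$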
1 
iG eS 
3 
E@) 5 4} 4 5 0 (b) QEESEED 147 0, 
O33 9 0 
00433 1 
8 55330 


5. 37> S599 © 0.937 (c) if the game can be won at 
all, it will eventually end 


9; QEESETSD 113 ome| 35.286 326 


375.214 2 | 


275 = 348 
(b) the consumption of each sector in dollars (c) $Mv = \begin{bmatrix} 26.706 \\ 30.560 \\ 42.734 \end{bmatrix}$; the farming sector consumes (26.706) more than it produces (10); the building sector produces (57) more than it consumes (30.56) (c) because 1 is an eigenvalue of every transition/consumption matrix (d) Any multiple of $\begin{bmatrix} 0.814670795745254 \\ 0.855931062340110 \\ 1 \end{bmatrix}$. For example, for the economy in part (c), where the total economy is 100 (100,000 dollars), the “everybody is happy” vector is $\begin{bmatrix} 30.5051385057193 \\ 32.0501185809012 \\ 37.4447429133795 \end{bmatrix}$.


1
6d: do = —~, Gn =


Section 7.3 


1: Verify the 10 properties of a vector space as in section 4.1


3: because sin (m1) = sin(0) = 0 when m = 0 


2mn 


Be ee = eee _yyntl 
“Gye ye) 


m 


2in2__(9(-1)"—1) form = 


(In 2)*-+m22 


(ii) 
(iii)


8d: (i) [graph] (ii) [graph] (iii) [graph]


9b: (i) [graph] (ii) [graph] (iii) [graph]


9d: (i) [graph] (ii) [graph] (iii) [graph]

10a: All four graphs are the same, equal to f(x) = | 


Section 7.4 
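1e: $v_1 = \begin{bmatrix} 3/5 \\ -7/10 \end{bmatrix} = \begin{bmatrix} 0.6 \\ -0.7 \end{bmatrix}$
$v_2 = \begin{bmatrix} 153/50 \\ 63/20 \end{bmatrix} = \begin{bmatrix} 3.06 \\ 3.15 \end{bmatrix}$
$v_3 = \begin{bmatrix} 21/20 \\ 119/1000 \end{bmatrix} = \begin{bmatrix} 1.05 \\ 0.119 \end{bmatrix}$
$v_4 = \begin{bmatrix} 12429/5000 \\ 22743/10000 \end{bmatrix} = \begin{bmatrix} 2.4858 \\ 2.2743 \end{bmatrix}$

1j: $v_1 = \begin{bmatrix} 3/5 \\ 3/8 \\ -7/10 \end{bmatrix} = \begin{bmatrix} 0.6 \\ 0.375 \\ -0.7 \end{bmatrix}$
$v_2 = \begin{bmatrix} 359/80 \\ -1791/400 \\ 1171/400 \end{bmatrix} = \begin{bmatrix} 4.4875 \\ -4.4775 \\ 2.9275 \end{bmatrix}$
$v_3 = \begin{bmatrix} 27199/4000 \\ -29553/4000 \\ 20751/4000 \end{bmatrix} = \begin{bmatrix} 6.79975 \\ -7.38825 \\ 5.18775 \end{bmatrix}$
$v_4 = \begin{bmatrix} 319787/40000 \\ -355527/40000 \\ 254891/40000 \end{bmatrix} = \begin{bmatrix} 7.994675 \\ -8.888175 \\ 6.372275 \end{bmatrix}$

1n: $v_1 = \begin{bmatrix} 1/7 \\ 2/7 \\ 3/7 \end{bmatrix} \approx \begin{bmatrix} 0.1428 \\ 0.2857 \\ 0.4285 \end{bmatrix}$
$v_2 = \begin{bmatrix} 1481/196 \\ -431/196 \\ 603/196 \end{bmatrix} \approx \begin{bmatrix} 7.556 \\ -2.198 \\ 3.076 \end{bmatrix}$
$v_3 = \begin{bmatrix} 1473/112 \\ -3021/784 \\ 3513/784 \end{bmatrix} \approx \begin{bmatrix} 13.15 \\ -3.853 \\ 4.480 \end{bmatrix}$
$v_4 = \begin{bmatrix} 60217/3136 \\ -17599/3136 \\ 17811/3136 \end{bmatrix} \approx \begin{bmatrix} 19.20 \\ -5.611 \\ 5.679 \end{bmatrix}$

2e: $x = \begin{bmatrix} 354/187 \\ 259/187 \end{bmatrix} \approx \begin{bmatrix} 1.893 \\ 1.385 \end{bmatrix}$

2i: $x = \begin{bmatrix} 13247/1440 \\ -499/48 \\ 10907/1440 \end{bmatrix} \approx \begin{bmatrix} 9.199 \\ -10.39 \\ 7.574 \end{bmatrix}$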
2n: none 
3a: -2,1 


4j: yes 
4n: no 
5e: approaches attractor 


5n: tends toward infinity 
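The two kinds of long term behavior named in 5e and 5n can be seen on a small affine system of the form $v_{k+1} = M v_k + b$. The following plain Python sketch uses made-up matrices and a made-up starting vector, none of which come from the exercises. The first matrix has spectral radius 0.5, so the orbit approaches the fixed point of the system, here [3, 2]; the second has spectral radius 2, so the orbit moves off toward infinity.

```python
def step(M, b, v):
    """One step of the affine system v_{k+1} = M v_k + b."""
    return [sum(m * x for m, x in zip(row, v)) + b_i for row, b_i in zip(M, b)]

def orbit_end(M, b, v0, steps):
    """Return the iterate reached after the given number of steps."""
    v = v0
    for _ in range(steps):
        v = step(M, b, v)
    return v

M_small = [[0.5, 0.25], [0.0, 0.5]]   # hypothetical; eigenvalues 0.5 and 0.5
M_large = [[2.0, 1.0], [0.0, 2.0]]    # hypothetical; eigenvalues 2 and 2
b = [1.0, 1.0]
v0 = [10.0, -7.0]                     # hypothetical initial condition

near = orbit_end(M_small, b, v0, 60)
far = orbit_end(M_large, b, v0, 60)

print(near)   # close to the fixed point [3, 2]
print(far)    # entries of very large magnitude
```

The fixed point [3, 2] solves x = M x + b for the first matrix, and changing the starting vector does not change where that orbit ends up.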


6: (a) the eigenvalues of M are 2 and 2 so the spectral 


radius is 3, which is less than | 


(b) ™. 


| 


6 8 


9: (a) the eigenvalues of M are 2 (3 + i) and 
2(y3 + i), which have the same magnitude: 


Bi v549- 25-0 yie+p- V2, 


which is greater than | 


10 12 


(b) 


Section 7.5 


1d: f(x) = :| x 


0 
If: ro =| i |x+| i | 


Ih: f(x) = 4 | 


6 Fle[3I 


2b: 


2h: 
3b: Two copies of this shape can be fitted together to form a parallelogram (see the solution to exercise 2b above). Such parallelograms can be pieced together to tessellate the plane, thereby tessellating the plane with this shape.


3d: All triangles tessellate the plane. Two congruent triangles can always be put together to form a parallelogram. Such parallelograms can be pieced together to tessellate the plane, thereby tessellating the plane with the triangle.


3h: Four copies of this shape can be fitted together to form a parallelogram (see the solution to exercise 2h above). Such parallelograms can be pieced together to tessellate the plane, thereby tessellating the plane with this shape.


7b: There are 5 congruent parts, so we must have $5s^2 = 1$ and therefore the scale factor is $\frac{1}{\sqrt{5}}$ for each part.
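The same reasoning applies to any planar rep-tile cut into n congruent parts: areas scale by the square of the scale factor, so $n s^2 = 1$. A one-function Python sketch of that relationship follows; the function name is made up for this illustration.

```python
import math

def reptile_scale_factor(parts):
    """Scale factor s of each part when a planar rep-tile is cut into
    `parts` congruent copies of itself: parts * s**2 = 1."""
    return 1 / math.sqrt(parts)

s = reptile_scale_factor(5)
print(s, 5 * s * s)
```

For a rep-tile cut into 4 parts the same function gives a scale factor of one half.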


L° 0 1 
. _| 2 
9b: {reo = 0 1 x+ 1 
aan) 2 
=|) 2 
T2(x) = | 0 l x+ 0 } 
-! 9g 2 
_ 2 
T3(x) = | 0 l x+ 0 ; 
0 i 0 
= 2 
T4(x) = = 0 x+ > } 
ee 
of: {rie -| 1 1 |x, 
2 2 
1 _41 5 
Tox)=| 7 7 |x+ } 
3 3 0 


9h: Translations will vary based on placement of the 
axes. Placing the origin at the lower left corner of 
the rep-tile with scaling so that the bottom side is 2 
units long, the IFS is 


1 
5 «COO 
{rio =| § 1 |X» 
2 
5 (0 1 
T2(x) = 0 2 [Xt of 
2 
-i _l 2 
T3(x) = 4 ic xt+ 1 | 
2 


9k: Translations will vary based on placement of the 
axes. Placing the origin at the center of the rep-tile 
with scaling so that the distance between nearest 
centers of the parts is 1 unit, the IFS is 


{reo =| | |x. 


NI- © 


at = 
T19)=| D |x 4 ‘al 
2 
-1 9 1 
70) =| 2 4 |x+ aa ; 
2 “ar 
_1 9 L 
T4(x) = 0 1 |x+ oe } 
2 mes 


10: In the form of the required compositions, the four 
transformations (see the solution of exercise 9b) are 


none, 5, 0, 1,1 
none, 5, 0, 2,0 

1 
A Oe 
none, 5, —90, 0, 2 


y-axis 


Screenshot of the Rep-Tile Designer with these parameters: [screenshot not reproduced]


10: In the form of the required compositions, the four 
transformations (see the solution of exercise 9f) are 


1 

none, —5, 45, 0, 0 
1 

none, —5, 45,5, 0 


Screenshot of the Rep-Tile Designer with these parameters: [screenshot not reproduced]


none, 5, 0, 0, 0 


none, 5, 0, 1,0 


- 
X-aXxls, 5, —135, 2,1 


Screenshot of the Rep-Tile Designer with these parameters: [screenshot not reproduced]
Index


addition 
matrix, see matrix, addition 
property of equality, 79, 82 
table, 72 
vector, see vector, addition 
adjugate, 35, 37 
affine transformation, see transformation, affine 
algebraic properties of matrices, see matrix, algebraic 
properties 
analysis, 71
area, definition of, 210
ASCII, 36, 37 
attractor, 261, 265 
augmented matrix, see matrix, augmented 


basic variable, see variable, basic 
basis, 125, 127 
change of, 167-169, 170 
orthogonal, 179, 219 
standard, 127, 127 
best approximation, 219, 221, 228, 248 
binomial coefficient, 96 


Cardano, Gerolamo, 201
Cartesian product, 130, 132
Cauchy, Augustin-Louis, 71
change of basis, see basis, change of 
characteristic equation, 44, 44 
characteristic polynomial, 44, 44 
characterization of matrices, see matrix, characteriza- 
tion 
codomain, 130, 132 
coefficient, 28, 30 
binomial, see binomial coefficient 
Fourier, see Fourier, coefficient 
matrix, see matrix, coefficient 
cofactor, 27, 30 
column space, 159, 162 
consistent linear system, see linear system, consistent 
coordinate vector, 167-169, 169
cryptography, 35
cubic formula, 201


decomposition, 199
Dedekind, Richard, 71
del Ferro, Scipione, 201
dependence, linear, see linear dependence 
determinant, 27-30, 30
algorithm, 106, 107
and area, 210, 214
and eigenvalues, 211, 211, 214
and volume, 213, 214 
by expansion, 97, 97, 103 
of inverse, 110, 112 
of lower triangular matrix, 92, 93 
of product, 110, 112 
of replacement matrix, 96, 102, 103 
of scale matrix, 96, 102, 103 
of swap matrix, 96, 97, 98, 102, 103 
of transpose, 102, 103 
of upper triangular matrix, 93, 94 
diagonalization, 187 
dilation, 270, 273 
dimension, 126, 127 
discrete dynamical system, 256-264, 264 
discriminant, 27 
distance, 23, 154, 154 
domain, 130, 132 
dominant eigenvalue, see eigenvalue, dominant 
dot product, 14, 16, 21, 153 


echelon form 
reduced row, 55, 60, 67, 160 
row, 55, 60, 62-63 
eigenpair, 42-44, 44, 75 
eigenspace, 159, 162 
eigenvalue, 42, 44, 75 
dominant, 204, 205, 264 
eigenvector, 42, 44, 75 
by row reduction, 111, 112 
elementary matrix, 51, 52, 103, 140, 144, 210 
determinant, 96-103 
inverse, 95-96 
elementary row operation, 51, 52, 163 
replace, 51, 52, 55 
scale, 51, 52, 55
swap, 51, 52, 55
equivalent statements, 87, 88
existence, see linear system, existence and uniqueness of solutions


factorization, 199
LU, 195-198, 199
diagonal, 187
triangular, 214
Fejér, 249
Ferrari, Lodovico, 201
field, 10
fixed point, 261, 265
Fourier
analysis, 254
coefficient, 252, 254
cosine series, 252, 254
series, 246-253, 253
sine series, 252, 253
free variable, see variable, free
function, 130, 132


Gardner, Martin, 270
Gershgorin circle theorem, 239
Golomb, Solomon, 209, 270


harmonic, 252, 254
Hill, Lester S., 37
homogeneous linear system, see linear system, homogeneous
Hutchinson’s theorem, 271, 271, 273
hyperspace, 213


identity, see matrix, identity
IFS, see iterated function system
image, 132, 132
set, 210, 213
inconsistent linear system, see linear system, inconsistent
independence
linear, see linear independence
induction, 71, 92, 93
inheritance, 29
initial condition, 259, 264
inner product, 153, 154, 248
inner product space, 153-154, 154
instantiation, 5
inverse function, 130, 132, 132
inverse matrix, see matrix, inverse
inverse relation, 130, 132
invertible function, 132
invertible matrix, see matrix, invertible
isomorphic, 149
isomorphism, 148-149, 149
iterate, 261, 264
iterated function system, 273
iteration, 202, 261, 264


kernel, 132, 132


Landau, Edmund, 71
least squares regression, see regression, linear
linear combination, 28, 28, 30
nontrivial, 86, 88
linear dependence, 86, 88, 126, 127, 160
linear equation, 49, 52
solution, 49, 52
linear independence, 85-87, 88, 125, 179
linear system, 49, 52
characterization of solutions, 162, 162
consistent, 63, 67
existence and uniqueness of solutions, 65, 67
general solution, 64, 162
homogeneous, 59, 60
homogeneous solution, 64
inconsistent, 63, 67
matrix form, 80-82, 82
nonhomogeneous, 59, 60
particular solution, 64, 162
solution, 49, 52
linear transformation, 131-132, 133, 143, 210
characterization, 143, 144
geometric interpretation, 137-143
long term behavior, 264, 261-264, 265
lower triangular matrix, see matrix, lower triangular
LU factorization, see factorization, LU


main diagonal, 33, 38
map, 131, 132
mapping, 131, 132
contraction, 273
Markov chain, 239, 237-242, 242
matrix, 3, 3-4, 5
addition, 9, 11
algebraic properties, 73, 74, 71-75, 75
augmented, 59, 60
characterization of, 86, 88, 91, 86-93, 93, 106, 106-107
coding, 35
coefficient, 59, 60
column, 13, 16
diagonalizable, 188
division, see matrix, inverse
elementary, see elementary matrix
entry, 3, 4, 5
equality, 4, 5
identity, 33, 37
inverse, 34, 33-37, 38
inverse algorithm, 110, 111, 112
invertible, 35, 38
lower triangular, 92, 93
multiplication, 13-15, 16, 80, 81, 82
nonnegative, 239
notation, 5
permutation, 197, 199
positive, 239
powers of, 186, 188
product, see matrix, multiplication
projection, 142, 144
row, 13, 15
similar, 188
size, 3, 5
square, 27, 30
standard, 139, 138-139, 144
stochastic, 239
subtraction, 9, 11
transition, 241, 242
transpose, 14, 16
upper triangular, 92, 93
zero, 73, 75
multiplication
matrix, see matrix, multiplication
property of equality, 79, 82
scalar, 10, 11


nonhomogeneous linear system, see linear system, nonhomogeneous
nontrivial solution, see solution, nontrivial
norm, 154, 154
normal equations, 232, 232-233, 233
normalize, 179
notation
function, 130
matrix, see matrix, notation
vector space, see vector space, notation
null space, 159, 162
nullity, 159, 162, 163


one-to-one, 148, 149
onto, 148, 149
operator
binary, 9, 10
orbit, 261, 264
orthogonal, 22, 23, 154, 154, 218, 221
basis, see basis, orthogonal
complement, 218, 221
decomposition, 219, 221
projection, 176, 179, 218, 221
set, 178, 179
orthogonalization, 176-178, 179, 232
orthonormal set, 178, 179


parametric vector form, 67
partial pivoting, 197, 199
Peano, Giuseppe, 71
permutation matrix, see matrix, permutation
perpendicular vectors, 21
pivot, 55, 60
column, 55, 60
position, 55, 60, 63
power method, 204, 201-205, 205
preimage, 132, 132
product
Cartesian, see Cartesian product
dot, see dot product
inner, see inner product
matrix, see matrix, multiplication
scalar, see multiplication, scalar
projection, 142
orthogonal, see orthogonal, projection
projection matrix, see matrix, projection
proof technique
contraposition, 87
equivalent statements, 87
induction, 92, 93
real numbers are equal, 106


quartic formula, 201
quintic polynomial, 201


range, 132, 133
rank, 159, 162, 163
recurrence, 259, 264
reduced row echelon form, see echelon form, reduced row
regression
linear, 227-233, 233
multilinear, 231, 233
multiple linear, 231, 233
relation, 130, 132
recurrence, 264
rep-tile, 268-272, 273
repeller, 265
ring, 15
row echelon form, see echelon form, row
row reduction, 55-59, 60
automated, 55


SageMath
charpoly(), 44
coercion, 23
determinant, 30
dot product, 17
echelon_form(), 66
eigenvalues(), 44
eigenvectors_right(), 44
inverse(), 38
magnitude, 24
matrix(), 6
nested statement, 7
norm(), 24
operators, 11
rref(), 66 
SageCell screenshot, 6 
submatrix, 6 
transpose(), 16 
vector(), 23 
scalar, 10, 11 
product, see multiplication, scalar 
set 
bounded, 273 
closed, 273 
compact, 273 
image, see image, set 
orthogonal, see orthogonal, set 
orthonormal, see orthonormal set 
solution, see solution, set 
spanning, see spanning set 
similarity, 186, 186-188, 188 
similitude, 270, 273 
solution 
least squares, 233 
linear equation, see linear equation, solution 
linear system, see linear system, solution 
nontrivial, 82, 82, 86, 88 
set, 62-66 
trivial, 86, 88 
solution space, 159-162 
span, 119, 120, 125, 127 
spanning set, 125, 127 
spectral radius, 263, 265 
standard basis, see basis, standard 
standard matrix, see matrix, standard 
submatrix, 4, 5 
subspace, 119, 120 
subtraction 
matrix, see matrix, subtraction 
vector, see vector, subtraction 
sum of squared errors, 228, 233 


Tartaglia, Niccolo, 201
transformation, 131, 132 
affine, 213, 214, 264 
rigid, 270, 273 
transpose, see matrix, transpose 
triangularization, 212, 214
trivial vector space, see vector space, trivial 


uniqueness, see linear system, existence and unique- 
ness of solutions 

unit vector, see vector, unit 

upper triangular matrix, see matrix, upper triangular 


variable 


basic, 65, 67 
free, 65, 67 
vector, 14, 16, 118, 120 
addition, 21, 22 
column, 15, 16 
coordinate, see coordinate vector 
geometric interpretation, 20, 22, 143 
magnitude, 21, 22 
orthogonality, see orthogonal 
representation, 147 
row, 15, 16
subtraction, 22, 23 
unit, 178, 179 
zero, 43, 44 
vector space, 119, 117-120 
notation, 120 
trivial, 126, 127 


Yanghui triangle, 49 


zero matrix, see matrix, zero 